NARROW ARITHMETIC PROGRESSIONS IN THE PRIMES


XUANCHENG SHAO 


Abstract. We study arithmetic progressions in primes with common differences as small as possible. Tao and Ziegler showed that, for any $k \geq 3$ and $N$ large, there exist non-trivial $k$-term arithmetic progressions in (any positive density subset of) the primes up to $N$ with common difference $O((\log N)^{L_k})$, for an unspecified constant $L_k$. In this work we obtain this statement with the precise value $L_k = (k-1)2^{k-2}$. This is achieved by proving a relative version of Szemerédi's theorem for narrow progressions requiring simpler pseudorandomness hypotheses in the spirit of recent work of Conlon, Fox, and Zhao.


1. Introduction 

A central problem in additive number theory concerns finding in the set of primes various linear patterns, such as $k$-term arithmetic progressions ($k$-APs) for $k \geq 2$. The groundbreaking work of Green and Tao shows that any positive density subset of the primes contains infinitely many $k$-APs.

Theorem 1.1 (Arithmetic progressions in primes). Let $k \geq 2$ be a positive integer and $\delta > 0$ be real. Let $N$ be sufficiently large depending on $k$ and $\delta$. Then any subset $A \subset \mathcal{P} \cap [N]$ with $|A| \geq \delta N/\log N$ contains a nontrivial $k$-AP.

Here $\mathcal{P}$ denotes the set of primes, $[N]$ denotes the interval $\{1,2,\cdots,N\}$, and a $k$-AP is called nontrivial if its common difference is nonzero. Recall Szemerédi's theorem, which asserts the existence of $k$-APs in dense subsets of the integers. Since the set of primes has density zero in the integers, Szemerédi's theorem does not immediately imply Theorem 1.1. The main idea in [7], now referred to as the transference principle, is to place the set of primes densely inside a superset of “almost primes”, and to show that this superset satisfies certain pseudorandomness hypotheses so that it behaves just like the set of all integers.

Theorem 1.2 (Relative Szemerédi's theorem). Let $k \geq 2$ be a positive integer and $\delta > 0$ be real. Let $N$ be prime and sufficiently large depending on $k$ and $\delta$. Let $G = \mathbb{Z}/N\mathbb{Z}$, and let $f, \nu : G \to \mathbb{R}$ be functions satisfying $0 \leq f \leq \nu$. Suppose that $\nu$ satisfies the $k$-linear forms conditions, and that $\mathbb{E}f \geq \delta$. Then $\Lambda(f,\cdots,f) \geq c$ for some constant $c = c(k,\delta) > 0$.

Here $\mathbb{E}f$ and $\mathbb{E}\nu$ denote the average values of $f$ and $\nu$, respectively, and the counting function $\Lambda(f_1,\cdots,f_k)$ is defined by

$$\Lambda(f_1,\cdots,f_k) = \mathbb{E}_{n\in G}\mathbb{E}_{d\in G} f_1(n)f_2(n+d)\cdots f_k(n+(k-1)d)$$

for functions $f_1,\cdots,f_k : G \to \mathbb{R}$.

Recently Conlon-Fox-Zhao [1] found a simpler proof of Theorem 1.2 using a sparse hypergraph regularity lemma, which also has the pleasant consequence of weakening the linear forms conditions that the majorant $\nu$ must satisfy. For the precise definition of these linear forms conditions, see Definition 2.1 below and the remarks following it.

The main goal of this paper is to find $k$-APs in primes with common difference as small as possible. This problem of finding narrow progressions in the primes has been studied by Tao and Ziegler [Tallin]. In fact, they studied the much more general problem of finding narrow polynomial progressions of the form $a + P_1(d), \cdots, a + P_k(d)$, where $P_1,\cdots,P_k$ are polynomials satisfying $P_1(0) = \cdots = P_k(0) = 0$, and showed that the step $d$ of these progressions can be taken to be $O((\log N)^L)$ for some constant $L > 0$ (depending only on $P_1,\cdots,P_k$). Moreover, they remarked that, in the case of arithmetic progressions, $L$ can be taken to be $Ck2^k$ for some absolute constant $C > 0$ by following their arguments specialized to APs. Our main result confirms this remark, and moreover gives a precise value of the exponent $L$, which we will argue is optimal under current technologies.

Theorem 1.3 (Narrow arithmetic progressions in primes). Let $k \geq 2$ be a positive integer and $\delta > 0$ be real. Let $N$ be sufficiently large depending on $k$ and $\delta$. Then any subset $A \subset \mathcal{P} \cap [N]$ with $|A| \geq \delta N/\log N$ contains a nontrivial $k$-AP with common difference $d$ satisfying $|d| = O_{k,\delta,\epsilon}((\log N)^{L_k+\epsilon})$ for any $\epsilon > 0$, where $L_k = (k-1)2^{k-2}$.
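
To get a feel for the exponent in Theorem 1.3, the following minimal sketch tabulates $L_k = (k-1)2^{k-2}$ and the resulting size $(\log N)^{L_k+\epsilon}$. The function names are illustrative only, and the implied constant $O_{k,\delta,\epsilon}$ is ignored, so the second function shows the order of growth rather than an actual bound.

```python
import math


def exponent_L(k: int) -> int:
    """Return L_k = (k - 1) * 2^(k - 2) for k >= 2."""
    if k < 2:
        raise ValueError("k must be at least 2")
    return (k - 1) * 2 ** (k - 2)


def difference_scale(k: int, N: float, eps: float = 0.0) -> float:
    """Return (log N)^(L_k + eps); the implied constant is not modelled."""
    return math.log(N) ** (exponent_L(k) + eps)


if __name__ == "__main__":
    for k in range(2, 7):
        print(k, exponent_L(k), difference_scale(k, 1e12))
```

For $k = 3$ this gives $L_3 = 4$, the value used as the running example in Remark 2.3.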

Just as Theorem 1.1 is deduced from Theorem 1.2, Theorem 1.3 will be deduced from the following relative version for narrow progressions.

Theorem 1.4 (Relative Szemerédi's theorem for narrow progressions). Let $k \geq 2$ be a positive integer and $\delta > 0$ be real. Let $N$ be prime and sufficiently large depending on $k$ and $\delta$. Let $G = \mathbb{Z}/N\mathbb{Z}$ and let $f, \nu : G \to \mathbb{R}$ be functions satisfying $0 \leq f \leq \nu$. Let $D, S \geq 2$ be positive integers satisfying $S = o(D)$. Suppose that $\nu$ satisfies the $k$-linear forms conditions with width $S$, and that $\mathbb{E}f \geq \delta$. Then $\Lambda_D(f,\cdots,f) \geq c$ for some constant $c = c(k,\delta) > 0$.

Here the counting function $\Lambda_D(f_1,\cdots,f_k)$ is defined by

$$\Lambda_D(f_1,\cdots,f_k) = \mathbb{E}_{n\in G}\mathbb{E}_{d\in[D]} f_1(n)f_2(n+d)\cdots f_k(n+(k-1)d)$$

for $f_1,\cdots,f_k : G \to \mathbb{R}$, and the interval $[D]$ is embedded in $G$ in the obvious way.
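
The counting function $\Lambda_D$ can be evaluated by brute force for small $N$. The sketch below is only an illustration of the definition with $f_1 = \cdots = f_k = f$; the name `narrow_count` and the list representation of $f$ are assumptions made for the example, and the code does not check that $N$ is prime.

```python
def narrow_count(f, D, k):
    """Average of f(n) f(n+d) ... f(n+(k-1)d) over n in Z/NZ and d in [D].

    f is a list of length N holding the values f(0), ..., f(N-1);
    indices are reduced modulo N, matching the embedding of [D] in G.
    """
    N = len(f)
    total = 0.0
    for n in range(N):
        for d in range(1, D + 1):
            term = 1.0
            for j in range(k):
                term *= f[(n + j * d) % N]
            total += term
    return total / (N * D)


if __name__ == "__main__":
    # Indicator of {0, 1, 2} in Z/7Z: the only 3-AP with step 1 is (0, 1, 2).
    indicator = [1, 1, 1, 0, 0, 0, 0]
    print(narrow_count(indicator, D=1, k=3))
```

The run time is $O(NDk)$, which is fine for experiments but not for the ranges of $N$ relevant to the theorems.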

See Definition 2.1 below for the precise definition of the $k$-linear forms conditions with width $S$, which are analogues of the $k$-linear forms conditions needed in Conlon-Fox-Zhao's work [1] in the narrow setting.

Remark 1.5 (The exponent $L_k$). If the set $\mathcal{P}$ in Theorem 1.3 is replaced by a random subset of $[N]$ with density $1/\log N$, then the statement holds with $L_k$ replaced by $k-1$ almost surely (see [Ml Proposition 2]). On the other hand, Theorem 1.3 fails if the exponent is smaller than $k-1$ (see m Proposition 1]). In Remark 2.3 below, we will see that (the normalized characteristic function of) a random subset of $[N]$ with density $\alpha$ satisfies the $k$-linear forms conditions with width $\alpha^{-L_k}$, and moreover the exponent $L_k$ here is optimal. Thus if one tries to prove Theorem 1.3 with a smaller value of $L_k$ via a transference principle, it is necessary to seek even more simplified linear forms conditions than those in [1].

One might ultimately be interested in the case when $A = \mathcal{P}$ is the set of all primes. The Hardy-Littlewood conjecture implies that there are infinitely many nontrivial $k$-APs in primes with common difference $O_k(1)$. This is only known unconditionally in the case $k = 2$ thanks to recent breakthroughs by Zhang im and by Maynard [12] (and by Tao independently), which assert that there are infinitely many pairs of primes with bounded gap. The Hardy-Littlewood conjecture also predicts an asymptotic formula for the number of $k$-APs in primes up to $N$ of a given common difference. For the problem of counting all $k$-APs in primes (without any restrictions on $d$), and indeed for counting any linear pattern with finite complexity, such an asymptotic formula is established in [S] (with a crucial ingredient in 0 ). Finally, one could also ask for asymptotic formulas of this type with the Liouville function $\lambda$ or the Möbius function $\mu$ (in which case the main term should be zero). Strong results of this type were recently established by Matomäki-Radziwiłł-Tao in the case $k = 2$. They showed that

$$\mathbb{E}_{d\in[D]}\left|\mathbb{E}_{n\in[N]}\mu(n)\mu(n+d)\right| = o(1)$$

as soon as $D \to \infty$, with a crucial input regarding multiplicative functions in (very) short intervals.


2. Outline of proof 

Conventions. Throughout this paper we fix the positive integer $k \geq 2$. We always work in the cyclic group $G = \mathbb{Z}/N\mathbb{Z}$, where $N$ is always assumed to be prime and sufficiently large. An integer $n$ is also viewed as an element in $G$ in the natural way. We use $o(1)$ to denote a quantity that tends to zero as $N \to \infty$. For a vector $s$, we always use $s_1, s_2, \cdots$ to denote its coordinates. Similarly, a vector $s^{(\tau)}$ for some $\tau \in \{0,1\}$ has coordinates $s_1^{(\tau)}, s_2^{(\tau)}, \cdots$, and a vector $s^{(\omega)}$ for some $\omega = (\omega_1, \omega_2, \cdots)$ has coordinates $s_1^{(\omega_1)}, s_2^{(\omega_2)}, \cdots$.

In this section we state the main ingredients in the proof of Theorems 1.3 and 1.4. We start by defining the $k$-linear forms conditions appearing in the statement of Theorem 1.4 (compare with [1, Definition 2.2]).

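Definition 2.1 (Linear forms conditions). Let $k \geq 2$ be a positive integer. Let $N$ be prime and let $G = \mathbb{Z}/N\mathbb{Z}$. Let $S \geq 2$ be real. We say that a function $\nu : G \to \mathbb{R}$ satisfies the $k$-linear forms conditions with width $S$ if the following conditions hold.

(1) For any convex body $\Omega \subset \mathbb{R}^{2k}$ with inradius $r(\Omega) \geq S$ and $\Omega \subset [-r(\Omega)^{O(1)}, r(\Omega)^{O(1)}]^{2k}$, we have

$$\mathbb{E}_{n\in G}\mathbb{E}_{(s^{(0)},s^{(1)})\in\Omega\cap\mathbb{Z}^{2k}} \prod_{j=1}^{k}\prod_{\omega\in\{0,1\}^{[k]\setminus\{j\}}} \nu\left(n + \psi_j(s^{(\omega)})\right)^{e(j,\omega)} = 1 + o(1),$$

for each choice of $e(j,\omega) \in \{0,1\}$, where $\psi_j : \mathbb{Z}^k \to \mathbb{Z}$ is the linear form defined by

(2.1) $$\psi_j(s_1,\cdots,s_k) = \sum_{i=1}^{k}(j-i)s_i.$$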

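(2) For any convex body $\Omega \subset \mathbb{R}^{2k}$ with inradius $r(\Omega) \geq S$ and $\Omega \subset [-r(\Omega)^{O(1)}, r(\Omega)^{O(1)}]^{2k}$, we have

$$\mathbb{E}_{n\in G}\mathbb{E}_{(s^{(0)},s^{(1)})\in\Omega\cap\mathbb{Z}^{2k}} \prod_{\omega\in\{0,1\}^{k}} \nu\left(n + \psi(s^{(\omega)})\right)^{e(\omega)} = 1 + o(1),$$

for each choice of $e(\omega) \in \{0,1\}$, where $\psi : \mathbb{Z}^k \to \mathbb{Z}$ is the linear form defined by

(2.2) $$\psi(s_1,\cdots,s_k) = k!\sum_{i=1}^{k}s_i.$$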

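(3) For any convex body $\Omega \subset \mathbb{R}^{2}$ with inradius $r(\Omega) \geq S$ and $\Omega \subset [-r(\Omega)^{O(1)}, r(\Omega)^{O(1)}]^{2}$, and any $1 \leq j \leq k$, we have

$$\mathbb{E}_{n\in G}\,\nu(n)^{e}\,\mathbb{E}_{(d^{(0)},d^{(1)})\in\Omega\cap\mathbb{Z}^{2}} \prod_{\substack{1\leq i\leq k\\ i\neq j}}\prod_{\tau\in\{0,1\}} \nu\left(n + (i-j)d^{(\tau)}\right)^{e(i,\tau)} = 1 + o(1),$$

for each choice of $e, e(i,\tau) \in \{0,1\}$.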

In the first condition, since $\psi_j$ does not depend on the $j$th variable, $\psi_j(s^{(\omega)})$ makes sense for $\omega \in \{0,1\}^{[k]\setminus\{j\}}$. As explained in [1, Section 2.2], the first set of these linear forms conditions occurs quite naturally, corresponding to 2-blowups of triangles in appropriate hypergraphs. These blowups are eventually responsible for the extra factor of $2^{k-2}$ in the exponent $L_k$. The presence of the other linear forms conditions is purely technical, coming from extra manoeuvres required to deal with the narrow nature of the progressions. However, the value of $L_k$ depends critically on only the first set of conditions.

Example 2.2. When $k = 3$, the first condition in the 3-linear forms conditions says that the product of the following 12 terms:

$\nu(n - x_2 - 2x_3)$, $\nu(n - x_2 - 2y_3)$, $\nu(n - y_2 - 2x_3)$, $\nu(n - y_2 - 2y_3)$,
$\nu(n + x_1 - x_3)$, $\nu(n + y_1 - x_3)$, $\nu(n + x_1 - y_3)$, $\nu(n + y_1 - y_3)$,
$\nu(n + 2x_1 + x_2)$, $\nu(n + 2y_1 + x_2)$, $\nu(n + 2x_1 + y_2)$, $\nu(n + 2y_1 + y_2)$,

when averaged over $n \in G$ and $(x_1,x_2,x_3,y_1,y_2,y_3) \in \Omega \cap \mathbb{Z}^6$, is equal to $1 + o(1)$. The same holds for the product of any subset of these 12 terms.

The proof of the relative Szemerédi theorem for narrow progressions (Theorem 1.4) will be carried out in Sections [3l9l. While the proof of its global analogue (Theorem 1.2) in [1] proceeds by passing to the corresponding counting problem in hypergraphs, we are unable to find a good graph model for counting narrow progressions. We thus proceed entirely in the arithmetic setting, motivated by the work of Zhao [18].

Remark 2.3. Now that the $k$-linear forms conditions are precisely defined, let us explain why any majorant $\nu$ for the primes should not satisfy the $k$-linear forms conditions with width $S$ below $(\log N)^{L_k}$. We illustrate this with the example $k = 3$ and $L_3 = 4$, and recall the linear forms in Example 2.2. Consider the contribution from those terms with $x_1 = y_1$. Under this restriction, four pairs of these linear forms take the same values. Since $\nu^2$ should have average about $\log N$, the average over all terms with $x_1 = y_1$ should have size about $(\log N)^4$. Thus if $S$ is smaller than $(\log N)^4$, these contributions will dominate and the linear forms conditions fail. Similarly, for general $k$, the restriction $x_1 = y_1$ creates $L_k = (k-1)2^{k-2}$ pairs of linear forms having the same value, and thus $S$ must be larger than $(\log N)^{L_k}$. The same argument also shows that, if $\nu$ is the normalized characteristic function of a random subset of $[N]$ with density $\alpha$, then it does not satisfy the $k$-linear forms conditions with width $S$ below $\alpha^{-L_k}$.
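
The pair count in Remark 2.3 can be checked mechanically. The sketch below is an illustration, not part of the original argument: it builds the coefficient vectors of the forms $\psi_j(s^{(\omega)})$ from (2.1) on the variables $(x_1,\cdots,x_k,y_1,\cdots,y_k)$, writing $x = s^{(0)}$ and $y = s^{(1)}$ as in Example 2.2, then imposes $x_1 = y_1$ and counts how many forms are lost to coincidences. The function names are hypothetical.

```python
from itertools import product


def first_condition_forms(k):
    """Coefficient vectors of psi_j(s^(omega)) on (x_1..x_k, y_1..y_k)."""
    forms = []
    for j in range(1, k + 1):
        others = [i for i in range(1, k + 1) if i != j]
        for omega in product((0, 1), repeat=k - 1):
            coeffs = [0] * (2 * k)
            for i, bit in zip(others, omega):
                # bit 0 selects x_i, bit 1 selects y_i; the coefficient is (j - i)
                coeffs[(i - 1) + k * bit] = j - i
            forms.append(tuple(coeffs))
    return forms


def collisions_when_x1_equals_y1(k):
    """Number of forms lost to coincidences after substituting y_1 = x_1."""
    forms = first_condition_forms(k)
    restricted = set()
    for c in forms:
        merged = list(c)
        merged[0] += merged[k]  # fold the y_1 coefficient into x_1
        merged[k] = 0
        restricted.add(tuple(merged))
    return len(forms) - len(restricted)


if __name__ == "__main__":
    for k in range(2, 7):
        print(k, len(first_condition_forms(k)), collisions_when_x1_equals_y1(k))
```

For $k = 3$ the system has 12 forms and the restriction produces 4 coincidences, in agreement with $L_3 = 4$.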


This remark motivates the following definition.

Definition 2.4. Let $\Psi = (\psi_1,\cdots,\psi_t) : \mathbb{Z}^d \to \mathbb{Z}^t$ be a system of distinct affine linear forms in $d$ variables $x = (x_1,\cdots,x_d)$. For any $I \subset [t]$, let $\Psi_I = \{\psi_i : i \in I\}$ and define

$$\Pi(\Psi_I) = \{x \in \mathbb{R}^d : \psi_i(x) = \psi_j(x) \text{ whenever } i,j \in I\}.$$

Furthermore, for any partition $\pi$ of $[t]$ (so that $\pi$ is a collection of disjoint subsets of $[t]$ whose union is $[t]$), define

$$\Pi(\Psi,\pi) = \bigcap_{I\in\pi}\Pi(\Psi_I).$$

Finally, define

$$L(\Psi) = \sup_{|\pi|<t}\frac{t-|\pi|}{\operatorname{codim}\Pi(\Psi,\pi)},$$

where $|\pi|$ denotes the number of subsets in the partition $\pi$, and the supremum is taken over all partitions $\pi$ of $[t]$ with $|\pi| < t$.

The denominator $\operatorname{codim}\Pi(\Psi,\pi)$ is the smallest number of independent linear conditions on $x_1,\cdots,x_d$ needed to create a linear subvariety on which linear forms from the same atom of $\pi$ are identical. By convention we set $\operatorname{codim}\Pi(\Psi,\pi) = \infty$ if $\Pi(\Psi,\pi) = \emptyset$. Since $\Psi$ consists of distinct linear forms, this codimension is positive whenever $|\pi| < t$. If $\Psi_k$ is the collection of linear forms appearing in the first set of $k$-linear forms conditions, then $L(\Psi_k) \geq L_k$ by Remark 2.3. We will show in Section 6 that equality holds.
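
For a single partition $\pi$, the quantity $(t-|\pi|)/\operatorname{codim}\Pi(\Psi,\pi)$ from Definition 2.4 reduces to a rank computation. The sketch below assumes homogeneous linear forms given as integer coefficient vectors (constant terms of affine forms are not modelled) and a partition given as lists of indices; both representations and the names `rank` and `ratio` are choices made for this illustration.

```python
from fractions import Fraction


def rank(rows):
    """Rank over the rationals via Gaussian elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def ratio(forms, partition):
    """(t - |pi|) / codim Pi(Psi, pi) for homogeneous linear forms."""
    diffs = []
    for block in partition:
        first = forms[block[0]]
        for idx in block[1:]:
            # psi_idx - psi_first must vanish on Pi(Psi, pi)
            diffs.append([a - b for a, b in zip(forms[idx], first)])
    codim = rank(diffs)
    if codim == 0:
        raise ValueError("partition must merge at least two distinct forms")
    return Fraction(len(forms) - len(partition), codim)


if __name__ == "__main__":
    forms = [(1, 0), (0, 1), (1, 1), (2, 0)]
    print(ratio(forms, [[0, 1], [2, 3]]))
```

Taking the supremum of `ratio` over all partitions with $|\pi| < t$ would give $L(\Psi)$; the number of partitions grows very quickly with $t$, so this is only practical for small systems.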


Proposition 2.5. Let $k \geq 2$ be a positive integer, and let $\Psi$ be the system of linear forms appearing in the first condition in the $k$-linear forms conditions. More precisely, $\Psi$ is the collection of linear forms $\psi$ in $2k$ variables $s^{(0)} = (s_1^{(0)},\cdots,s_k^{(0)})$ and $s^{(1)} = (s_1^{(1)},\cdots,s_k^{(1)})$ of the form

$$\psi(s^{(0)},s^{(1)}) = \psi_j(s^{(\omega)})$$

for some $1 \leq j \leq k$ and $\omega \in \{0,1\}^{[k]\setminus\{j\}}$, where $\psi_j$ is defined in (2.1). Then $L(\Psi) = L_k = (k-1)2^{k-2}$.

This proposition explains the occurrence of $L_k$ in Theorem 1.3. In principle we also need to evaluate $L(\Psi)$ for the systems $\Psi$ appearing in the second and the third set of $k$-linear forms conditions. These tasks are much easier. A moment's thought reveals that $L(\Psi) = 2^{k-1}$ when $\Psi$ is the system in the second set of conditions, and $L(\Psi) = k-1$ when $\Psi$ is the system in the third set of conditions. Both values are at most $(k-1)2^{k-2}$ when $k \geq 3$.

In order to apply Theorem 1.4 we also need a majorant $\nu$ for the ($W$-tricked) primes satisfying the $k$-linear forms conditions. The idea of using a smoothly truncated version of Selberg's weight was first considered by Goldston-Pintz-Yıldırım [HE]. See also [HI Appendix D] and the note [TT]. In Sections 3-5 we will review the basic properties of this majorant and prove that it satisfies the $k$-linear forms conditions.


Proposition 2.6 (Pseudorandom majorants). Fix a positive integer $t_0$. Let $N$ be prime and sufficiently large, and let $G = \mathbb{Z}/N\mathbb{Z}$. Let $w \leq 0.1\log\log N$ be a slowly growing function of $N$, and let $W = \prod_{p\leq w}p$. Take any reduced residue class $b \pmod{W}$. There exists a function $\nu = \nu_{W,b} : G \to \mathbb{R}$ satisfying the following conditions.

(1) $\nu_{W,b}(n) \geq 0$ for any $n$ and moreover

$$\nu_{W,b}(n) \geq c\,\frac{\varphi(W)\log N}{W}$$

for some constant $c = c(t_0) > 0$, whenever $Wn + b$ is prime and $Wn + b >$ .

(2) For any system of distinct affine linear forms $\Psi = (\psi_1,\cdots,\psi_t) : \mathbb{Z}^d \to \mathbb{Z}^t$ with $t \leq t_0$, and any convex body $\Omega \subset \mathbb{R}^d$ with inradius $r(\Omega)$ and $\Omega \subset [-r(\Omega)^{O(1)}, r(\Omega)^{O(1)}]^d$, such that $r(\Omega) \geq g(N)(\log N)^{L(\Psi)}$ for a function $g$ satisfying $g(N) \to \infty$ as $N \to \infty$, we have

$$\mathbb{E}_{n\in G}\mathbb{E}_{x\in\Omega\cap\mathbb{Z}^d}\prod_{i=1}^{t}\nu(n+\psi_i(x)) = 1 + o_{\Psi;N\to\infty}(1).$$


In view of Proposition 2.5 and the remark following it, this implies that the function $\nu_{W,b}$ satisfies the $k$-linear forms conditions with width $g(N)(\log N)^{L_k}$. We now have all the ingredients needed to deduce Theorem 1.3.


Deduction of Theorem 1.3 from Theorem 1.4 assuming Propositions 2.5 and 2.6. We may assume that $k \geq 3$, as the statement is trivial when $k = 2$. By a diagonalization argument, it suffices to prove the statement with $|d| \leq g(N)(\log N)^{L_k}$ for any slowly growing function $g$ and large $N$. Let $w = w(N) \leq 0.1\log g(N)$ be a slowly growing function and let $W = \prod_{p\leq w}p$. Choose a prime $N' \in [2N/W, 4N/W]$, and let $G = \mathbb{Z}/N'\mathbb{Z}$. By the pigeonhole principle, we may choose a reduced residue class $b \pmod{W}$ such that

(2.3) $$\left|\{p \in A : p \equiv b \pmod{W} \text{ and } p > N^{1/2}\}\right| \geq \frac{\delta N}{\varphi(W)\log N} - N^{1/2} \geq \frac{\delta N}{2\varphi(W)\log N}.$$

Let $\nu = \nu_{W,b} : G \to \mathbb{R}$ be the majorant from Proposition 2.6, and let $f : G \to \mathbb{R}$ be the function defined by

$$f(n) = \begin{cases} c\,\dfrac{\varphi(W)\log N'}{W} & \text{if } Wn + b \in A \text{ and } Wn + b > N^{1/2},\\ 0 & \text{otherwise,}\end{cases}$$

where $c = c(k) > 0$ is sufficiently small. Then $0 \leq f \leq \nu$. Moreover, from (2.3) we obtain

$$\mathbb{E}_{n\in G}f(n) \geq \frac{1}{N'}\cdot c\,\frac{\varphi(W)\log N'}{W}\cdot\frac{\delta N}{2\varphi(W)\log N} \geq \frac{c\delta}{10}.$$

Set $S = g(N)^{1/4}(\log N)^{L_k}$ and $D = \lfloor g(N)^{1/2}(\log N)^{L_k}\rfloor$. Since $\nu$ satisfies the $k$-linear forms conditions with width $S$ (see the remark following Proposition 2.6), we may apply Theorem 1.4 to conclude that $\Lambda_D(f,\cdots,f) \gg_\delta 1$. In other words, there exist $k$-APs $n, n+d, \cdots, n+(k-1)d$ with $1 \leq d \leq D$ such that each $n + jd$ ($0 \leq j \leq k-1$) lies in the support of $f$. Each such $k$-AP gives rise to a $k$-AP $Wn + b, W(n+d) + b, \cdots, W(n+(k-1)d) + b$ in $A$, with step $Wd \leq WD \leq g(N)(\log N)^{L_k}$ as desired. □
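
The pigeonhole step of the deduction (the $W$-trick) is easy to experiment with. The following sketch, with hypothetical helper names, computes $W = \prod_{p\leq w}p$ and selects the reduced residue class $b \pmod{W}$ that contains the most elements of a given set $A$. It ignores the lower cutoff on $p$ used in (2.3) and is meant only to illustrate the choice of $b$.

```python
from math import gcd


def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            for m in range(p * p, n + 1, p):
                sieve[m] = False
    return [i for i, is_p in enumerate(sieve) if is_p]


def primorial(w):
    """W = product of the primes p <= w."""
    W = 1
    for p in primes_up_to(w):
        W *= p
    return W


def best_residue_class(A, W):
    """Reduced residue class b mod W containing the most elements of A."""
    counts = {}
    for a in A:
        b = a % W
        if gcd(b, W) == 1:
            counts[b] = counts.get(b, 0) + 1
    b = max(counts, key=counts.get)
    return b, counts[b]


if __name__ == "__main__":
    W = primorial(3)
    print(W, best_residue_class(primes_up_to(100), W))
```

With $A$ the primes up to 100 and $W = 6$, the 23 primes coprime to 6 split as 11 in the class $1 \pmod 6$ and 12 in the class $5 \pmod 6$, so the class $b = 5$ is chosen.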


3. The truncated von-Mangoldt function and the prime majorant 


We construct the majorant $\nu$ required in Proposition 2.6 as follows. Let $R <$
a parameter and let $\chi : \mathbb{R} \to \mathbb{R}$ be a smooth function supported on $[-1,1]$. Assume that
$\chi(0) \geq 1/2$ and moreover

/ OO 

■OO 

Define the truncated von-Mangoldt function $\Lambda_{\chi,R}$ with parameter $R$ and the smooth cutoff $\chi$ by the formula

(3.2) $$\Lambda_{\chi,R}(n) = \log R\sum_{d\mid n}\mu(d)\,\chi\left(\frac{\log d}{\log R}\right).$$

Note that if $n$ is prime and $n > R$, then $\Lambda_{\chi,R}(n) = \chi(0)\log R \geq (\log R)/2$. Define the majorant $\nu_{\chi,R,W,b} : G \to \mathbb{R}$ by the formula

(3.3) $$\nu_{\chi,R,W,b}(n) = \frac{\varphi(W)}{W\log R}\,\Lambda_{\chi,R}(Wn+b)^2.$$

It is clearly non-negative, and satisfies

(3.4) $$\nu_{\chi,R,W,b}(n) \geq \frac{\varphi(W)\log R}{4W}$$

whenever $Wn + b$ is prime and $Wn + b > R$. The smoothly truncated nature of $\chi$ allows us to obtain precise asymptotic formulas for correlation estimates involving $\Lambda_{\chi,R}$. First we need some definitions.
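
As a numerical illustration of the truncated von-Mangoldt function, the sketch below evaluates the sum $\log R\sum_{d\mid n}\mu(d)\chi(\log d/\log R)$ by trial division. The cutoff `chi` is a standard smooth bump supported on $[-1,1]$ with $\chi(0) = 1$; it is an assumption of this example and is not rescaled to satisfy the normalization (3.1), so only qualitative features (such as the value at large primes) should be read off.

```python
import math


def mobius(n):
    """Moebius function via trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a square divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def chi(t):
    """Smooth bump supported on [-1, 1] with chi(0) = 1 (not normalized)."""
    if abs(t) >= 1:
        return 0.0
    return math.exp(1.0 - 1.0 / (1.0 - t * t))


def truncated_von_mangoldt(n, R):
    """log R * sum over d | n of mu(d) * chi(log d / log R)."""
    log_r = math.log(R)
    total = 0.0
    for d in range(1, n + 1):
        if n % d == 0:
            total += mobius(d) * chi(math.log(d) / log_r)
    return log_r * total


if __name__ == "__main__":
    for n in (97, 101, 105):
        print(n, truncated_von_mangoldt(n, R=10))
```

For a prime $n > R$ only the divisor $d = 1$ contributes, so the value is $\chi(0)\log R$, as noted after the definition.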
Definition 3.1 (Singular series). For a vector $\mathbf{h} = (h_1,\cdots,h_k) \in \mathbb{Z}^k$, we define the singular series

$$\mathfrak{S}(\mathbf{h}) = \prod_{p}\left(1-\frac{\nu_p(\mathbf{h})}{p}\right)\left(1-\frac{1}{p}\right)^{-r},$$

where $r = \#\{h_1,\cdots,h_k\}$ and $\nu_p(\mathbf{h})$ is the number of residue classes modulo $p$ occupied by elements in $h_1,\cdots,h_k$. For a positive integer $W$, define also the $W$-tricked singular series

$$\mathfrak{S}_W(\mathbf{h}) = \prod_{p\nmid W}\left(1-\frac{\nu_p(\mathbf{h})}{p}\right)\left(1-\frac{1}{p}\right)^{-r}.$$
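
The local quantities in Definition 3.1 are simple to compute. The sketch below evaluates $\nu_p(\mathbf{h})$ and a truncation of the $W$-tricked singular series to primes $p \leq$ `bound`; the truncation point and the function names are assumptions of this example, and the truncated product only approximates the infinite product.

```python
def nu_p(h, p):
    """Number of residue classes modulo p occupied by the entries of h."""
    return len({x % p for x in h})


def small_primes(bound):
    """Primes up to bound by trial division (adequate for small bounds)."""
    return [p for p in range(2, bound + 1)
            if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def truncated_singular_series_W(h, W, bound):
    """Product over primes p <= bound with p not dividing W of the local factor
    (1 - nu_p(h)/p) * (1 - 1/p)^(-r), where r is the number of distinct h_i."""
    r = len(set(h))
    value = 1.0
    for p in small_primes(bound):
        if W % p == 0:
            continue  # primes dividing W contribute the factor 1
        value *= (1 - nu_p(h, p) / p) * (1 - 1 / p) ** (-r)
    return value


if __name__ == "__main__":
    print(nu_p((0, 2, 6), 3), truncated_singular_series_W((0, 2), W=2, bound=1000))
```

With $\mathbf{h} = (0,2)$ and $W = 2$ the product runs over odd primes, each contributing $(1-2/p)(1-1/p)^{-2}$.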


Definition 3.2 (Sieve factor). Let $\chi : \mathbb{R} \to \mathbb{R}$ be a smooth compactly supported function. For any positive integer $m$, we define the sieve factor

Y[ij{tj)dtj, 


^Xi'^ 


JM. \jel / 


6=1 


where the function : M —>■ M is defined by the relation 

/ OO 

^{t)e-^^^dt. 

-OO 

More generally, for a vector $\mathbf{h} = (h_1,\cdots,h_k) \in \mathbb{Z}^k$ define the sieve factor

$$c_\chi(\mathbf{h}) = \prod_{h\in\{h_1,\cdots,h_k\}}c_{\chi,m(h)},$$

where $m(h) = \#\{1 \leq i \leq k : h_i = h\}$.

We will not directly need the precise definition of $c_{\chi,m}$ apart from the fact that $c_{\chi,2} = 1$, a consequence of the normalization (3.1).

Proposition 3.3 (Correlation estimates for $\Lambda_{\chi,R}$). Let $N, W$ be positive integers and let $b \pmod{W}$ be a reduced residue class. Let $\Lambda_{\chi,R}$ be defined as in (3.2). Let $\mathbf{h} = (h_1,h_2,\cdots,h_k) \in \mathbb{Z}^k$. Then

$$\mathbb{E}_{n\leq N}\prod_{i=1}^{k}\Lambda_{\chi,R}(W(n+h_i)+b) = \left(\frac{W}{\varphi(W)}\right)^{r}\left(c_\chi(\mathbf{h})\mathfrak{S}_W(\mathbf{h}) + o(E(\mathbf{h}))\right)(\log R)^{k-r} + O\left(N^{-1}R^{k}(\log R)^{k}\right),$$

where $r = \#\{h_1,\cdots,h_k\}$, and

$$E(\mathbf{h}) = \exp\left(O\left(\sum_{p\mid\Delta(\mathbf{h})}\frac{1}{p}\right)\right)$$

with

(3.5) $$\Delta(\mathbf{h}) = \prod_{\substack{1\leq i<j\leq k\\ h_i\neq h_j}}(h_i-h_j).$$

Proof. When $W = 1$, this is exactly the main result in [13]. The general case follows from straightforward adaptation of the argument there. □

In Section 4 we establish some auxiliary results concerning average values of singular series, used to understand averages of $\mathfrak{S}_W(\mathbf{h})$ and $E(\mathbf{h})$ as $\mathbf{h}$ varies. In Section 5 we will then prove that the function $\nu_{\chi,R,W,b}$ satisfies the required correlation estimates in Proposition 2.6.

4. Average of the singular series 

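In this section, we prove an auxiliary result on the average of singular series appearing in Proposition 3.3. This is a generalization of a result of Gallagher [3] (see also 0).

Proposition 4.1. Let $w \geq 1$ be a parameter. For each prime $p$, let $g_p(\mathbf{h}) = g_p(h_1,\cdots,h_t)$ be a function with $g_p \geq 1$ such that the following conditions hold:

(1) $g_p(\mathbf{h}) = 1 + O(p^{-1})$ for any $\mathbf{h} \in \mathbb{Z}^t$;

(2) $g_p(\mathbf{h}) = 1 + O(p^{-2})$ if $p \nmid \Delta(\mathbf{h})$, where $\Delta(\mathbf{h})$ is defined in (3.5);

(3) $g_p(\mathbf{h}) = 1$ whenever $p \leq w$.

Define $g : \mathbb{Z}^t \to \mathbb{R}$ by the (absolutely convergent) infinite product

$$g(\mathbf{h}) = \prod_{p}g_p(\mathbf{h}).$$

Let $\mathcal{H} \subset \mathbb{Z}^t$ be a (multi)set. Then for any $Q \geq 2$ and $\epsilon > 0$, we have

$$\mathbb{E}_{\mathbf{h}\in\mathcal{H}}\,g(\mathbf{h}) = 1 + O(w^{-1}) + O\left(\sum_{w<q\leq Q}\frac{\mu^2(q)C^{\omega(q)}}{q}\,\mathbb{P}_{\mathbf{h}\in\mathcal{H}}(q\mid\Delta(\mathbf{h}))\right) + O_\epsilon\left(Q^{-1}\max_{\mathbf{h}\in\mathcal{H}}|\Delta(\mathbf{h})|^{\epsilon}\right)$$

for some constant $C = O(1)$, where $\omega(q)$ is the number of prime divisors of $q$, and $\mathbb{P}_{\mathbf{h}\in\mathcal{H}}(q \mid \Delta(\mathbf{h}))$ is the probability that $q \mid \Delta(\mathbf{h})$ when $\mathbf{h}$ is chosen uniformly at random from $\mathcal{H}$.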

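Proof. Define a new function $g'$ by the finite product

$$g'(\mathbf{h}) = \prod_{p\mid\Delta(\mathbf{h})}g_p(\mathbf{h}).$$

Since

$$g(\mathbf{h}) = \prod_{p\mid\Delta(\mathbf{h})}g_p(\mathbf{h})\cdot\prod_{\substack{p\nmid\Delta(\mathbf{h})\\ p>w}}\left(1+O(p^{-2})\right) = g'(\mathbf{h})\left(1+O(w^{-1})\right),$$

it suffices to prove the proposition for $g'$. From now on we thus assume that $g_p(\mathbf{h}) = 1$ whenever $p \nmid \Delta(\mathbf{h})$.

For fixed $\mathbf{h} \in \mathbb{Z}^t$, define a multiplicative function $a_{\mathbf{h}}(q)$ supported on squarefree integers $q$ by the formula

$$a_{\mathbf{h}}(q) = \prod_{p\mid q}\left(g_p(\mathbf{h})-1\right).$$

Then $a_{\mathbf{h}}(q)$ is non-negative and vanishes unless $q \mid \Delta(\mathbf{h})$. Moreover,

$$g(\mathbf{h}) = \prod_{p\mid\Delta(\mathbf{h})}\left(1+a_{\mathbf{h}}(p)\right) = \sum_{q\mid\Delta(\mathbf{h})}a_{\mathbf{h}}(q).$$

Since $a_{\mathbf{h}}(p) = O(p^{-1})$ by hypothesis, we have

$$a_{\mathbf{h}}(q) \leq \frac{C^{\omega(q)}}{q}$$

for some $C = O(1)$, and thus

$$\sum_{\substack{q\mid\Delta(\mathbf{h})\\ q>Q}}a_{\mathbf{h}}(q) \leq \sum_{\substack{q\mid\Delta(\mathbf{h})\\ q>Q}}\frac{\mu^2(q)C^{\omega(q)}}{q} \leq \frac{1}{Q}\sum_{q\mid\Delta(\mathbf{h})}\mu^2(q)C^{\omega(q)} \ll_\epsilon \frac{|\Delta(\mathbf{h})|^{\epsilon}}{Q},$$

where the last inequality follows from the identity

$$\sum_{q\mid\Delta(\mathbf{h})}\mu^2(q)C^{\omega(q)} = \prod_{p\mid\Delta(\mathbf{h})}(1+C) = (1+C)^{\omega(\Delta(\mathbf{h}))}$$

and the bound $\omega(\Delta(\mathbf{h})) = o(\log|\Delta(\mathbf{h})|)$. Hence,

$$g(\mathbf{h}) = \sum_{\substack{q\mid\Delta(\mathbf{h})\\ q\leq Q}}a_{\mathbf{h}}(q) + O_\epsilon\left(Q^{-1}|\Delta(\mathbf{h})|^{\epsilon}\right).$$

Average the above equation over $\mathbf{h} \in \mathcal{H}$. The $q = 1$ term contributes 1 since $a_{\mathbf{h}}(1) = 1$ for any $\mathbf{h}$. If $1 < q \leq w$, then $a_{\mathbf{h}}(q) = 0$ for any $\mathbf{h}$. For $w < q \leq Q$, we have

$$\mathbb{E}_{\mathbf{h}\in\mathcal{H}}\,a_{\mathbf{h}}(q) \leq \frac{C^{\omega(q)}}{q}\,\mathbb{P}_{\mathbf{h}\in\mathcal{H}}(q\mid\Delta(\mathbf{h})).$$

This completes the proof. □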
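
The key algebraic step in the proof above is the expansion of a product over primes $p \mid \Delta(\mathbf{h})$ into a sum over squarefree divisors $q \mid \Delta(\mathbf{h})$. The sketch below checks this identity with exact rational arithmetic for an arbitrary assignment $p \mapsto a(p)$; the dictionary representation and function names are choices made for this illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def product_side(a):
    """Product over primes p of (1 + a(p))."""
    return prod((1 + v for v in a.values()), start=Fraction(1))


def divisor_sum_side(a):
    """Sum over squarefree q built from the primes in a of a(q),
    where a(q) is the product of a(p) over p dividing q (and a(1) = 1)."""
    primes = list(a)
    total = Fraction(0)
    for size in range(len(primes) + 1):
        for subset in combinations(primes, size):
            total += prod((a[p] for p in subset), start=Fraction(1))
    return total


if __name__ == "__main__":
    a = {2: Fraction(1, 2), 3: Fraction(1, 3), 7: Fraction(2, 7)}
    print(product_side(a), divisor_sum_side(a))
```

Both sides agree exactly; the empty subset corresponds to the term $q = 1$, which contributes 1 in the averaging step.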


We will apply this proposition twice, to deal with the main term $\mathfrak{S}_W(\mathbf{h})$ and to handle the error term $E(\mathbf{h})$.

Example 4.2. If $g = \mathfrak{S}_W$ (recall Definition 3.1), then

$$g_p(\mathbf{h}) = \left(1-\frac{1}{p}\right)^{-|\{h_1,\cdots,h_t\}|}\left(1-\frac{\nu_p(h_1,\cdots,h_t)}{p}\right)$$

for $p \nmid W$, and $g_p(\mathbf{h}) = 1$ for $p \mid W$. It clearly satisfies the assumptions (1) and (3) in the statement of Proposition 4.1. If $p \nmid \Delta(\mathbf{h})$, then $|\{h_1,\cdots,h_t\}| = \nu_p(h_1,\cdots,h_t)$, and thus $g_p(\mathbf{h}) = 1 + O(p^{-2})$, which verifies the assumption (2).

Example 4.3. If $g = E$ (defined in the statement of Proposition 3.3), then

$$g_p(\mathbf{h}) = \begin{cases}1 + C_p/p & p \mid \Delta(\mathbf{h}),\\ 1 & p \nmid \Delta(\mathbf{h}),\end{cases}$$

for some constant $C_p = O(1)$. It clearly satisfies all the assumptions in the statement of Proposition 4.1 with $w = 1$.

If the set $\mathcal{H}$ equidistributes in residue classes with modulus up to $Q$, then Proposition 4.1 implies that the average of $g(\mathbf{h})$ over $\mathbf{h} \in \mathcal{H}$ is $O(1)$ for any $w$, and is $1 + o(1)$ if $w \to \infty$.

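Corollary 4.4. Let the notations and assumptions be as in Proposition 4.1. Suppose that for each squarefree $q \leq Q$, we have $\mathbb{P}_{\mathbf{h}\in\mathcal{H}}(q \mid \Delta(\mathbf{h})) \leq C^{\omega(q)}/q$ for some constant $C = O(1)$. Then for any $\epsilon > 0$ we have

$$\mathbb{E}_{\mathbf{h}\in\mathcal{H}}\,g(\mathbf{h}) = 1 + O\left(w^{-1}\log^{O(1)}(2+w)\right) + O_\epsilon\left(Q^{-1}\max_{\mathbf{h}\in\mathcal{H}}|\Delta(\mathbf{h})|^{\epsilon}\right).$$

Proof. In view of Proposition 4.1, it suffices to show that

$$\sum_{q>w}\frac{\mu^2(q)C^{\omega(q)}}{q^2} \ll \frac{\log^{O(1)}(2+w)}{w}.$$

Indeed, by Rankin's trick, this sum is bounded by

$$\sum_{q}\left(\frac{q}{w}\right)^{1-1/\log w}\frac{\mu^2(q)C^{\omega(q)}}{q^2} = \frac{w^{1/\log w}}{w}\sum_{q}\frac{\mu^2(q)C^{\omega(q)}}{q^{1+1/\log w}} \ll \frac{\zeta(1+1/\log w)^{O(1)}}{w} \ll \frac{\log^{O(1)}(2+w)}{w},$$

as desired. □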


5. Pseudorandomness of the prime majorant 


In this section we prove Proposition 2.6 using the majorant $\nu_{\chi,R,W,b}$ constructed in (3.3) with $R =$ . The lower bound on $\nu_{\chi,R,W,b}$ clearly follows from (3.4).

Now fix a system of distinct affine linear forms $\Psi = (\psi_1,\cdots,\psi_t) : \mathbb{Z}^d \to \mathbb{Z}^t$ with $t \leq t_0$. Note that each $x \in \mathbb{Z}^d$ induces a partition $\pi(x)$ of $[t]$, according to the values $\psi_1(x),\cdots,\psi_t(x)$. Precisely, two indices $i,j \in [t]$ lie in the same atom of $\pi(x)$ if and only if $\psi_i(x) = \psi_j(x)$. For each partition $\pi$ of $[t]$, let $X(\pi)$ be the set of $x \in \mathbb{Z}^d$ with $\pi(x) = \pi$. It suffices to show that

$$\frac{1}{|\Omega\cap\mathbb{Z}^d|}\sum_{x\in\Omega\cap X(\pi)}\mathbb{E}_{n\in G}\prod_{i=1}^{t}\nu(n+\psi_i(x)) = 1_{|\pi|=t} + o(1),$$

for each partition $\pi$ of $[t]$. For the remainder of this section, we fix the partition $\pi$ and write simply $X$ for $X(\pi)$. Let $s = |\pi|$. We may assume that $\Omega \cap X$ is nonempty. The implied constants in this section are always allowed to depend on $d, t, \Psi, \pi, \chi$.

From the definition (3.3) of $\nu_{\chi,R,W,b}$, we need to show that

$$\frac{|\Omega\cap X|}{|\Omega\cap\mathbb{Z}^d|}\left(\frac{\varphi(W)}{W\log R}\right)^{t}\mathbb{E}_{x\in\Omega\cap X}\mathbb{E}_{n\in G}\prod_{i=1}^{t}\Lambda_{\chi,R}(W(n+\psi_i(x))+b)^2 = 1_{s=t} + o(1).$$

By Proposition 3.3, the inner average over $n$ above is

$$\left(\frac{W}{\varphi(W)}\right)^{s}(\log R)^{2t-s}\left[c_\chi(\Psi(x),\Psi(x))\mathfrak{S}_W(\Psi(x)) + o(E(\Psi(x)))\right] + O\left(N^{-1}R^{2t}(\log R)^{2t}\right).$$

The last error term above is negligible by the choice of $R$. Thus we need to show that

(5.1) $$\frac{|\Omega\cap X|}{|\Omega\cap\mathbb{Z}^d|}\left(\frac{\varphi(W)\log R}{W}\right)^{t-s}\mathbb{E}_{x\in\Omega\cap X}\left[\mathfrak{S}_W(\Psi(x))c_\chi(\Psi(x),\Psi(x)) + o(E(\Psi(x)))\right] = 1_{s=t} + o(1).$$

It is convenient to introduce the linear variety $V \subset \mathbb{R}^d$ consisting of those vectors $x \in \mathbb{R}^d$ whose induced partitions $\pi(x)$ are the same as or coarser than $\pi$, and let $\overline{X} = V \cap \mathbb{Z}^d$. In other words, $\overline{X}$ is the set of $x \in \mathbb{Z}^d$ satisfying $\psi_i(x) = \psi_j(x)$ whenever $i,j$ lie in the same atom of $\pi$. Note that $X \subset \overline{X}$ always, and $\overline{X} = \mathbb{Z}^d$ when $s = t$.

Lemma 5.1. Let $L_1 \subset L_2 \subset \mathbb{Z}^d$ be two lattices (not necessarily of full rank). Let $\Omega \subset \mathbb{R}^d$ be a convex body with inradius $r(\Omega) \geq 2$. Then

$$|\Omega\cap L_1| \ll_{d,L_1,L_2} r(\Omega)^{\dim(L_1)-\dim(L_2)}\,|\Omega\cap L_2|.$$

Proof. Via a linear transformation (depending only on $d$ and $L_2$), we may assume that $L_2$ is the standard lattice $\mathbb{Z}^{\dim(L_2)}$ naturally embedded in $\mathbb{Z}^d$. By restricting to $\mathbb{R}^{\dim(L_2)}$ we may assume that $L_2 = \mathbb{Z}^d$. With these assumptions we may use the following covering inequality in convex geometry (see m Lemma C.4]):

(5.2) $$\mathbb{E}_{x\in\Omega\cap\mathbb{Z}^d}f(x) \ll \sup_{y\in\mathbb{R}^d}\mathbb{E}_{x\in(y+[-r(\Omega),r(\Omega)]^d)\cap\mathbb{Z}^d}f(x)$$

applied to the function $f = 1_{L_1}$. It thus suffices to show that the probability that a random point $x$ in a $d$-dimensional box of side lengths $2r(\Omega)$ lies in $L_1$ is $O(r(\Omega)^{\dim(L_1)-d})$. This is clear, since any point $x \in L_1$ is determined by $\dim(L_1)$ of its coordinates, and there are $O(r(\Omega))$ ways to choose each of these coordinates. □


Lemma 5.2. Let $X$ and $\overline{X}$ be as above. Let $\Omega \subset \mathbb{R}^d$ be a convex body with inradius $r(\Omega) \geq 2$. Then

$$|\Omega\cap X| = \left(1+O(r(\Omega)^{-1})\right)|\Omega\cap\overline{X}|.$$

Proof. Note that $X$ is obtained from $\overline{X}$ by removing a few linear subvarieties $V_1, V_2, \cdots$ from $V$. Since $X$ is non-empty, these subvarieties have codimension at least 1 in $V$. By Lemma 5.1 applied to (suitable translates of) $V_i \cap \mathbb{Z}^d$ and $V \cap \mathbb{Z}^d$, we obtain

$$|\Omega\cap V_i\cap\mathbb{Z}^d| \ll r(\Omega)^{-1}|\Omega\cap V\cap\mathbb{Z}^d|$$

for each $i$. This gives the desired conclusion. □

Lemma 5.3. Let $L \subset \mathbb{Z}^d$ be a lattice (not necessarily of full rank). Let $\Omega \subset \mathbb{R}^d$ be a convex body with inradius $r(\Omega) \geq 2$. For any positive integer $q$ and any function $f : L \to \mathbb{R}$ satisfying $f(x+m) = f(x)$ for any $x \in L$ and $m \in qL$, we have

$$\mathbb{E}_{x\in\Omega\cap L}f(x) = \left(1+O_{d,L}\left(\frac{q}{r(\Omega)}\right)\right)\mathbb{E}_{x\in L/qL}f(x).$$

Proof. By a linear change of variables, we may assume that $L \subset \mathbb{Z}^d$ is the lattice spanned by the standard basis vectors $e_1,\cdots,e_{\dim L}$. After projecting to the first $\dim L$ coordinates, we may assume that $d = \dim L$ and $L = \mathbb{Z}^d$. The assertion then becomes m Corollary C.3]. □


Corollary 5.4 (Equidistribution in residue classes). Let the notations be as above. Let $\mathcal{H} = \{\Psi(x) : x \in \Omega\cap X\}$. Then for any squarefree $q \leq r(\Omega)$, we have $\mathbb{P}_{\mathbf{h}\in\mathcal{H}}(q \mid \Delta(\mathbf{h})) \leq C^{\omega(q)}q^{-1}$ for some constant $C = C(\Psi) > 0$.

Proof. We may assume that $q$ is sufficiently large. For each $r \in (\mathbb{Z}/q\mathbb{Z})^d$, let $\overline{X}(q,r) \subset \overline{X}$ be the sublattice consisting of those $x \in \overline{X}$ with $x \equiv r \pmod{q}$. Let $R_q$ be the set of $r \in (\mathbb{Z}/q\mathbb{Z})^d$ satisfying

$$\prod_{(i,j)}\left(\psi_i(r)-\psi_j(r)\right) \equiv 0 \pmod{q},$$

where the product is taken over all pairs $(i,j)$ such that $i,j$ lie in different atoms of the partition $\pi$. Thus $q \mid \Delta(\mathbf{h})$ if and only if $\mathbf{h} = \Psi(x)$ for some $x \in \Omega\cap X$ with $x \pmod{q} \in R_q$. It suffices to show that

|nnx(g,r)| <- \nnx\. 

reRg ^ 


By Lemma f5.3l applied to (a suitable translate of) X and the function /(x) 
we have 


nX(g,r)| 

\nnx\ 




(mod g)) < g 


lx=r 


(mod q ); 


where the second inequality holds since g is sufficiently large depending on X. Combining 
this with Lemma 15.21 we obtain 


|nnx(g,r)| = q-^\nnx\. 

It thus suffices to show that |Rg| < When g is prime, Rg is the union of at most 

hyperplanes in (Z/gZ)'^ cut out by equations of the form tjji = ijjj (mod g). The desired 
bound |Rg| -C g‘^“^ follows in this case, since each such hyperplane contains 0{q^~Y points 
(recall that the implied constants here are allowed to depend on T). For general squarefree 
g, the conclusion follows by multiplicativity. □ 


With these lemmas in hand, we may now prove (5.1) and thus complete the proof of Proposition 2.6. Since $\Omega \subset [-r(\Omega)^{O(1)}, r(\Omega)^{O(1)}]^d$, we have

$$\max_{x\in\Omega\cap\mathbb{Z}^d}|\Delta(\Psi(x))| \ll r(\Omega)^{O(1)}.$$

In view of Corollary 5.4, we may apply Corollary 4.4 with $Q = r(\Omega)$ (say) to obtain

$$\mathbb{E}_{x\in\Omega\cap X}\,\mathfrak{S}_W(\Psi(x)) = 1 + o(1),$$

and

$$\mathbb{E}_{x\in\Omega\cap X}\,E(\Psi(x)) = O(1).$$

To prove (5.1), we divide into two cases according to whether $s = t$ or $s < t$. If $s = t$, then

$$|\Omega\cap X| = \left(1+O(r(\Omega)^{-1})\right)|\Omega\cap\mathbb{Z}^d|$$

by Lemma 5.2. Since the values of $\psi_i(x)$ are all distinct for $x \in X$ in this case, the sieve factor $c_\chi(\Psi(x),\Psi(x))$ is the product of copies of $c_{\chi,2}$, and hence equal to 1. Thus the left side of (5.1) is

$$\left(1+O(r(\Omega)^{-1})\right)\left[\mathbb{E}_{x\in\Omega\cap X}\,\mathfrak{S}_W(\Psi(x)) + o\left(\mathbb{E}_{x\in\Omega\cap X}\,E(\Psi(x))\right)\right] = 1 + o(1),$$

as desired. In the case when $s < t$, by Lemma 5.1 the left side of (5.1) is bounded by

$$O\left(r(\Omega)^{-\operatorname{codim}(\overline{X})}(\log R)^{t-s}\right) = o(1),$$

by the hypothesis $r(\Omega) \geq g(N)(\log N)^{L(\Psi)}$ and Definition 2.4. This completes the proof.

6. Determining the constant L(T) 

In this section we prove Proposition 2.5. It will be convenient here to parametrize the linear forms in $\Psi$ differently in the following way. For $v \in \{1,2,\cdots,k\}$ and $I \subset \{1,2,\cdots,k\}$, define

$$\psi_{v,I}(x,y) = \sum_{i\in I}(v-i)x_i + \sum_{i\notin I}(v-i)y_i,$$

for $x = (x_1,\cdots,x_k) \in \mathbb{R}^k$ and $y = (y_1,\cdots,y_k) \in \mathbb{R}^k$. Since the coefficients of $x_v$ and $y_v$ in $\psi_{v,I}$ are always 0, we have $\psi_{v,I\cup\{v\}} = \psi_{v,I\setminus\{v\}}$. For each $\psi \in \Psi$, define $v(\psi) \in \{1,2,\cdots,k\}$ and $I(\psi) \subset \{1,2,\cdots,k\}$ by the condition that $\psi = \psi_{v(\psi),I(\psi)}$. We impose the constraint that $v(\psi) \in I(\psi)$, so that $v(\psi)$ and $I(\psi)$ are uniquely determined by $\psi$. Proposition 2.5 clearly follows from the following two propositions.

Proposition 6.1. For any linear subvariety $\Pi \subset \{(x,y) : x,y \in \mathbb{R}^k\}$ of codimension 1, the number of distinct linear forms in $\Psi$ when restricted to $\Pi$ is at least $(k+1)2^{k-2}$.

Proposition 6.2. For any linear subvariety $\Pi \subset \{(x,y) : x,y \in \mathbb{R}^k\}$ of codimension 2, the number of distinct linear forms in $\Psi$ when restricted to $\Pi$ is at least $2^{k-1}$.

We will prove them in Sections 6.2 and 6.3 after developing a few preliminary lemmas in Section 6.1. In this section we always use $\psi_1, \psi_2, \cdots$ to denote linear forms in $\Psi$ instead of the ones defined in (2.1).
6.1. Dependencies among linear forms in $\Psi$. For a collection $\{\psi_1,\cdots,\psi_s\} \subset \Psi$ of linear forms, denote by $\Pi(\psi_1,\cdots,\psi_s)$ the linear subvariety consisting of those $(x,y)$ such that the values $\psi_i(x,y)$ are all identical for $1 \leq i \leq s$. Generically we expect $\Pi(\psi_1,\cdots,\psi_s)$ to have codimension $s-1$. The following lemmas classify a few non-generic cases.

Lemma 6.3 (Non-generic case of three linear forms). Let $\psi_1,\psi_2,\psi_3 \in \Psi$ be three distinct linear forms. If $\Pi(\psi_1,\psi_2,\psi_3)$ has codimension 1, then $\psi_1,\psi_2,\psi_3$ share a common set of variables.

Here and later, we say that a collection of linear forms $\{\psi_1,\cdots,\psi_s\} \subset \Psi$ shares a common set of variables, if there exists a subset $I \subset \{1,2,\cdots,k\}$ such that $I(\psi_j) = I \cup \{v(\psi_j)\}$ for each $1 \leq j \leq s$. In other words, all linear forms $\psi_1,\cdots,\psi_s$ depend only on the variables $\{x_i : i \in I\}$ and $\{y_i : i \notin I\}$.

Proof. Write $v_j = v(\psi_j)$ and $I_j = I(\psi_j)$ for $j \in \{1,2,3\}$. Since $\Pi(\psi_1,\psi_2,\psi_3)$ has codimension 1, there exist nonzero constants $c_1,c_2,c_3 \in \mathbb{R}$ with $c_1 + c_2 + c_3 = 0$, such that

$$c_1\psi_1 + c_2\psi_2 + c_3\psi_3 = 0.$$

Examining the coefficients of $x_i$ and $y_i$ in the above equation, we obtain

(6.1) $$c_1(i-v_1)1_{i\in I_1} + c_2(i-v_2)1_{i\in I_2} + c_3(i-v_3)1_{i\in I_3} = 0$$

and

(6.2) $$c_1(i-v_1)1_{i\notin I_1} + c_2(i-v_2)1_{i\notin I_2} + c_3(i-v_3)1_{i\notin I_3} = 0$$

for each $1 \leq i \leq k$. Let $I = I_1 \cap I_2 \cap I_3$. We show that $I_1 = I \cup \{v_1\}$, and thus similarly $I_2 = I \cup \{v_2\}$ and $I_3 = I \cup \{v_3\}$. To this end, we pick an arbitrary $i_1 \in I_1 \setminus I$, and prove that $i_1 = v_1$. Since $i_1 \notin I$, $i_1$ lies in at most one of $I_2$ and $I_3$. If $i_1$ lies in neither $I_2$ nor $I_3$, then (6.1) with $i = i_1$ yields

$$c_1(i_1-v_1) = 0.$$

Since $c_1 \neq 0$, we have $i_1 = v_1$ as desired.

Now assume that $i_1$ lies in exactly one of $I_2$ and $I_3$. Without loss of generality, assume that $i_1 \in I_2$ and $i_1 \notin I_3$. Then (6.2) with $i = i_1$ yields

$$c_3(i_1-v_3) = 0.$$

Since $c_3 \neq 0$, we have $i_1 = v_3$, but this contradicts our restriction that $v_3 \in I_3$. □
Lemma 6.4 (Non-generic case of five linear forms). Let $\psi_1, \dots, \psi_5 \in \Psi$ be five distinct linear
forms. If $\Pi(\psi_1, \dots, \psi_5)$ has codimension at most 2, then three of them share a common set
of variables.

Proof. Write $v_j = v(\psi_j)$ and $I_j = I(\psi_j)$ for $1 \le j \le 5$. Suppose, for the purpose of
contradiction, that no three of $\psi_1, \dots, \psi_5$ share a common set of variables. Let $I = I_1 \cap
\dots \cap I_5$. We show that $I_1 = I \cup \{v_1\}$, and thus similarly $I_j = I \cup \{v_j\}$ for each $2 \le j \le 5$.
To this end, we pick an arbitrary $i_1 \in I_1 \setminus I$, and prove that $i_1 = v_1$. We divide into cases
according to whether $i_1$ lies in $I_2, \dots, I_5$ or not.
First assume that $i_1$ lies in none of $I_2, \dots, I_5$. Since $\Pi(\psi_1, \dots, \psi_5)$ has codimension at
most 2, in particular $\Pi(\psi_1, \dots, \psi_4)$ has codimension at most 2. Hence there exist constants
$c_1, \dots, c_4 \in \mathbb{R}$, not all zero, with $c_1 + \dots + c_4 = 0$, such that

(6.3)    $c_1\psi_1 + c_2\psi_2 + c_3\psi_3 + c_4\psi_4 = 0.$

Examining the coefficients of $x_{i_1}$ in the above equation, we obtain

(6.4)    $c_1(i_1 - v_1) = 0.$

If $c_1 = 0$, then we may apply Lemma 6.3 to $\psi_2, \psi_3, \psi_4$ to conclude that $\psi_2, \psi_3, \psi_4$ share a
common set of variables, a contradiction. Hence $c_1 \ne 0$, and thus $i_1 = v_1$ as desired.

Next assume that $i_1$ lies in exactly one of $I_2, \dots, I_5$, say $I_2$. Repeat the argument above
with $\psi_1, \psi_3, \psi_4, \psi_5$ (instead of $\psi_1, \psi_2, \psi_3, \psi_4$) to arrive at (6.4) again.

If $i_1$ lies in exactly two of $I_2, \dots, I_5$, say $I_2$ and $I_3$, then by examining the coefficients of
$y_{i_1}$ in (6.3) we obtain

    $c_4(i_1 - v_4) = 0.$

If $c_4 = 0$, then Lemma 6.3 implies that $\psi_1, \psi_2, \psi_3$ share a common set of variables, a
contradiction. Hence $c_4 \ne 0$, and thus $i_1 = v_4$, but this contradicts our restriction that $v_4 \in I_4$.

Finally, if $i_1$ lies in exactly three of $I_2, \dots, I_5$, say $I_2, I_3, I_4$, then repeat the argument in
the previous case with $\psi_1, \psi_2, \psi_3, \psi_5$ to arrive at $i_1 = v_5$, again contradicting our restriction
that $v_5 \in I_5$. □

Lemma 6.5 (Non-generic case of linear forms restricted to a hyperplane). Let $I \subset
\{1, 2, \dots, k\}$ be a subset and $\Pi_I$ be a subspace defined by

    $\Pi_I = \bigl\{(x, y) : \sum_{i \in I} x_i + \sum_{i \notin I} y_i = 0\bigr\}.$

Let $\psi_1, \dots, \psi_s$ be linear forms in $\Psi$, and let $\bar\psi_1, \dots, \bar\psi_s$ be their restrictions to $\Pi_I$. Suppose
that $\bar\psi_1, \dots, \bar\psi_s$ are all distinct, and that $\Pi(\bar\psi_1, \dots, \bar\psi_s)$ has codimension at most 1 in $\Pi_I$.
Then $s \le k$.

Proof. Without loss of generality we may assume that $I = \{1, 2, \dots, k\}$, so that $\Pi_I$ is cut
out by the equation $x_1 + \dots + x_k = 0$. Write $v_j = v(\psi_j)$ and $I_j = I(\psi_j)$ for $1 \le j \le s$.
It suffices to prove the assertion that each index $i_0$ belongs to either none of the $I_j$, or all of
the $I_j$, or exactly one of the $I_j$. Indeed, suppose that this is proved, and let $I_0$ be the intersection
$I_1 \cap \dots \cap I_s$. Suppose that $I_1 = \dots = I_t = I_0$ and $I_j \ne I_0$ for $t < j \le s$. By the assertion,
each index not in $I_0$ can appear in at most one of $I_{t+1}, \dots, I_s$. Since each set in $I_{t+1}, \dots, I_s$
contains an index not in $I_0$, we deduce that $s - t \le k - |I_0|$. Since $\psi_1, \dots, \psi_t$ are distinct
and $I_1 = \dots = I_t$, the values $v_1, \dots, v_t$ must be distinct, and thus $t \le |I_0|$. It follows that
$s \le k$ as desired.

To prove the assertion, suppose that $i_0 \in I_1$, $i_0 \in I_2$, and $i_0 \notin I_3$ for some $1 \le i_0 \le k$. Since
$\Pi(\bar\psi_1, \bar\psi_2, \bar\psi_3)$ has codimension at most 1 in $\Pi_I$, there exist nonzero constants $c_1, c_2, c_3 \in \mathbb{R}$
with $c_1 + c_2 + c_3 = 0$, such that

    $c_1\bar\psi_1 + c_2\bar\psi_2 + c_3\bar\psi_3 = 0.$

It follows that

(6.5)    $c_1\psi_1 + c_2\psi_2 + c_3\psi_3 = c(x_1 + \dots + x_k)$

for some $c \in \mathbb{R}$. Examining the coefficients of $y_{i_0}$ in the above equation, we obtain

    $c_3(i_0 - v_3) = 0.$

Since $c_3 \ne 0$, we have $i_0 = v_3$, contradicting the fact that $v_3 \in I_3$. □

6.2. Proof of Proposition 6.1. Suppose that there is a subspace $\Pi \subset \{(x, y) : x, y \in \mathbb{R}^k\}$
of codimension 1 such that the number of distinct linear forms in $\Psi$ when restricted to $\Pi$ is
at most $(k + 1)2^{k-2} - 1$. Partition $\Psi$ into $m \le (k + 1)2^{k-2} - 1$ subsets $\Psi_1, \dots, \Psi_m$ according
to their restrictions to $\Pi$. In other words, the restrictions of $\psi \in \Psi_j$ and $\psi' \in \Psi_{j'}$ to $\Pi$ are
identical if and only if $j = j'$.

Case 1. First suppose that no two forms in the same subset share a common set of
variables. Then Lemma 6.3 implies that each $\Psi_j$ contains at most 2 forms. We may write

    $\Pi = \{(x, y) : \psi_1(x, y) = \psi_2(x, y)\}$

for two distinct forms $\psi_1, \psi_2$ lying in the same $\Psi_j$. We count the number of pairs $(\psi'_1, \psi'_2)$
with

(6.6)    $\psi'_1 - \psi'_2 = c(\psi_1 - \psi_2)$

for some $c \in \mathbb{R}$, and it suffices to show that this number is at most $(k - 1)2^{k-2}$. Equivalently,
we show that the number of forms not belonging to any pair is at least $2^{k-1}$. We divide
into two cases.
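
The equivalence of the two counts can be checked directly. The sketch below assumes that $\Psi$ contains $k2^{k-1}$ forms in total (a count consistent with the case analysis in Section 6.3, but not restated in this section) and writes $p$ for the number of pairs satisfying (6.6); the symbol $p$ is introduced here only for illustration.

```latex
% Assumption: |\Psi| = k 2^{k-1}; p = number of (disjoint) pairs satisfying (6.6).
\begin{align*}
\#\{\text{forms in no pair}\} &= k2^{k-1} - 2p, \\
p \le (k-1)2^{k-2}
  \iff k2^{k-1} - 2p &\ge k2^{k-1} - (k-1)2^{k-1} = 2^{k-1}, \\
m \ge k2^{k-1} - p &\ge k2^{k-1} - (k-1)2^{k-2} = (k+1)2^{k-2}.
\end{align*}
```

The last line is incompatible with $m \le (k+1)2^{k-2} - 1$, which is the contradiction being sought.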

If $v(\psi_1) = v(\psi_2) = v$, then the equation $\psi_1 = \psi_2$ defining $\Pi$ is of the form

(6.7)    $\sum_{i \in I} \epsilon_i (i - v)(x_i - y_i) = 0,$

for some $I \subset [k] \setminus \{v\}$ and $\epsilon_i \in \{\pm 1\}$. In fact, $I$ is the set of indices lying in exactly one of
$I(\psi_1)$ and $I(\psi_2)$. If (6.6) holds, then $v(\psi'_1) = v(\psi'_2) = v'$, and the equation $\psi'_1 = \psi'_2$ is of the
form

(6.8)    $\sum_{i \in I'} \epsilon'_i (i - v')(x_i - y_i) = 0$

for some $\epsilon'_i \in \{\pm 1\}$, where $I'$ is the set of indices lying in exactly one of $I(\psi'_1)$ and $I(\psi'_2)$.
Since (6.7) and (6.8) are the same, we must have $I = I'$ and $v' \notin I$, and moreover either
$\epsilon'_i = \epsilon_i$ for all $i \in I$ or $\epsilon'_i = -\epsilon_i$ for all $i \in I$. Thus for fixed $v' \notin I$, the number of choices for
the unordered pair $\{I(\psi'_1), I(\psi'_2)\}$ is at most $2^{k-1-|I|}$ (since $v'$ must lie in $I(\psi'_1)$ and $I(\psi'_2)$).

Thus the number of (unordered) pairs $\{\psi'_1, \psi'_2\}$ satisfying (6.6) is at most

    $(k - |I|)2^{k-1-|I|} \le (k - 1)2^{k-2}$

as desired.

Now assume that $v(\psi_1) \ne v(\psi_2)$. If (6.6) holds, then $v(\psi'_1) \ne v(\psi'_2)$. Thus, if
the coefficient of some variable $x_i$ or $y_i$ in $\psi'_1$ is nonzero, so is its coefficient in $\psi'_1 - \psi'_2$. The
same goes for $\psi'_2$. Since $\psi_1$ and $\psi_2$ do not involve at least two variables ($x_{v(\psi_1)}$ or $y_{v(\psi_1)}$,
together with $x_{v(\psi_2)}$ or $y_{v(\psi_2)}$), neither $\psi'_1$ nor $\psi'_2$ is allowed to depend on these two variables.
There are certainly at least $2^{k-1}$ forms in $\Psi$ involving either of these two variables, and they
must appear as singletons in the partition $\Psi_1 \cup \dots \cup \Psi_m$, as desired.

Case 2. Now assume that two forms in some subset $\Psi_j$ share a common set of variables.
Then $\Pi$ must be of the form

    $\Pi = \bigl\{(x, y) : \sum_{i \in I} x_i + \sum_{i \notin I} y_i = 0\bigr\}$

for some $I \subset \{1, 2, \dots, k\}$. If two forms $\psi_1, \psi_2$ lie in the same $\Psi_j$, then $\psi_1$ and $\psi_2$ are identical
on $\Pi$. Thus they must share the common set of variables $\{x_i : i \in I\}$ and $\{y_i : i \notin I\}$. There
are certainly at least $2^{k-1}$ forms in $\Psi$ involving other variables, and they must appear as
singletons in the partition $\Psi_1 \cup \dots \cup \Psi_m$, as desired.

6.3. Proof of Proposition 6.2. Suppose that there is a subspace $\Pi \subset \{(x, y) : x, y \in \mathbb{R}^k\}$
of codimension 2 such that the number of distinct linear forms in $\Psi$ when restricted to $\Pi$
is at most $2^{k-1} - 1$. Partition $\Psi$ into $m \le 2^{k-1} - 1$ subsets $\Psi_1, \dots, \Psi_m$ according to their
restrictions to $\Pi$. In other words, the restrictions of $\psi \in \Psi_j$ and $\psi' \in \Psi_{j'}$ to $\Pi$ are identical
if and only if $j = j'$. We divide into two cases.

First suppose that no two forms in the same subset $\Psi_j$ share a common set of variables.
Then Lemma 6.4 implies that each $\Psi_j$ contains at most 4 forms. Since $m \le 2^{k-1} - 1$, this
can happen only if $k = 3$, in which case $m = 3$ and $\Psi_1, \Psi_2, \Psi_3$ all contain exactly 4 forms.
By our assumption, for any choice of $z_i \in \{x_i, y_i\}$ ($i = 1, 2, 3$), the three forms $-z_2 - 2z_3$,
$z_1 - z_3$, and $2z_1 + z_2$ lie in distinct $\Psi_j$. Since $-z_2 - 2z_3$, $z_1 - z_3$, $2z_1 + z_2$ form an arithmetic
progression with common difference $z_1 + z_2 + z_3$, it follows that the restrictions of $z_1 + z_2 + z_3$ to
$\Pi$ are identical up to sign for any choice of $z_i \in \{x_i, y_i\}$. This contradicts the fact that $\Pi$ has
codimension at most 2.
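
The two numerical facts used in this case can be verified by hand. The sketch below again assumes $|\Psi| = k2^{k-1}$ and $k \ge 3$; neither assumption is restated in this section.

```latex
% Why k = 3: the m classes, each of size at most 4, must cover all of \Psi.
% Assumptions: |\Psi| = k 2^{k-1} and k >= 3.
\begin{align*}
k2^{k-1} \le 4m \le 4(2^{k-1} - 1)
  &\iff (4-k)2^{k-1} \ge 4 \iff k = 3, \\
k = 3:\quad |\Psi| = 12,\ m \le 3
  &\implies m = 3 \text{ and each class has exactly 4 forms.} \\[4pt]
% The three forms are in arithmetic progression:
(z_1 - z_3) - (-z_2 - 2z_3) &= z_1 + z_2 + z_3, \\
(2z_1 + z_2) - (z_1 - z_3) &= z_1 + z_2 + z_3.
\end{align*}
```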

Now suppose that two forms in some $\Psi_j$ share a common set of variables, so that $\Pi \subset \Pi_I$
for some $I \subset [k]$. For $\psi \in \Psi$ denote by $\bar\psi$ its restriction to $\Pi_I$. The number of distinct $\bar\psi$ as $\psi$
ranges over all forms in $\Psi$ is easily seen to be $k \cdot 2^{k-1} - (k - 1)$. By the pigeonhole principle,
there must be $k + 1$ distinct forms $\bar\psi_1, \dots, \bar\psi_{k+1}$ whose restrictions to $\Pi$ are identical, but
this contradicts Lemma 6.5.
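
The pigeonhole step rests on a one-line comparison, sketched here using only the two counts quoted in the paragraph above.

```latex
% If every class on \Pi contained at most k distinct restrictions \bar\psi,
% the classes could cover at most k(2^{k-1} - 1) of them, which is too few.
\begin{align*}
\#\{\text{distinct } \bar\psi\} &= k2^{k-1} - (k-1), \qquad
\#\{\text{classes on } \Pi\} \le 2^{k-1} - 1, \\
k\,(2^{k-1} - 1) &= k2^{k-1} - k \;<\; k2^{k-1} - (k-1).
\end{align*}
```

Hence some class must contain at least $k + 1$ distinct restrictions.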

7. Relative Szemerédi's theorem for narrow progressions

To prove the relative Szemerédi's theorem for narrow progressions (Theorem 1.4), it suffices
to prove the following transference principle.

Theorem 7.1 (Transference). Let $k \ge 2$ be a positive integer. Let $N$ be a sufficiently large
prime, and let $G = \mathbb{Z}/N\mathbb{Z}$. Let $f, \nu : G \to \mathbb{R}$ be functions satisfying $0 \le f \le \nu$. Let
$D, S \ge 2$ be positive integers satisfying $S = o(D)$. Suppose that $\nu$ satisfies the $k$-linear forms
conditions with width $S$. Then there exists a function $\tilde f : G \to [0, 1]$ with $\mathbb{E}\tilde f = \mathbb{E}f + o(1)$,
such that

    $|\Lambda_D(f, \dots, f) - \Lambda_D(\tilde f, \dots, \tilde f)| = o(1).$

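Proof of Theorem 1.4 assuming Theorem 7.1. Apply Theorem 7.1 to obtain the bounded
function $\tilde f$. Since $\mathbb{E}f \ge \delta$ we have $\mathbb{E}\tilde f \ge \delta/2$, and it suffices to show that $\Lambda_D(\tilde f, \dots, \tilde f) \gg_{k,\delta} 1$.

For each $m \in G$, let $\tilde f_m : [D] \to [0, 1]$ be the function defined by $\tilde f_m(n) = \tilde f(m + n)$. Let
$M \subset G$ be the set of $m \in G$ with $\mathbb{E}\tilde f_m \ge \delta/4$. From the inequalities

    $\frac{\delta}{2} \le \mathbb{E}\tilde f = \mathbb{E}_{m \in G}\mathbb{E}\tilde f_m \le \frac{|M|}{|G|} + \frac{\delta}{4},$

we conclude that $|M| \ge \delta|G|/4$. For each $m \in M$ we apply (the quantitative version of)
Szemerédi's theorem (see for example [7, Proposition 2.3]) after embedding $[D]$ into a cyclic
group to obtain

    $\mathbb{E}_{n, d \in [D]} \tilde f_m(n)\tilde f_m(n + d) \cdots \tilde f_m(n + (k - 1)d) \gg_{k,\delta} 1.$

Here we naturally set $\tilde f_m(n) = 0$ for $n \notin [D]$. Averaging this over all $m \in G$, we arrive at

    $\mathbb{E}_{m \in G}\mathbb{E}_{n, d \in [D]} \tilde f(m + n)\tilde f(m + n + d) \cdots \tilde f(m + n + (k - 1)d) \gg_{k,\delta} 1.$

This is equivalent to the desired claim $\Lambda_D(\tilde f, \dots, \tilde f) \gg_{k,\delta} 1$ after a change of variables. □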
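
The lower bound on $|M|$ comes from splitting the average over $m$ according to membership in $M$. A short sketch, using only that $0 \le \tilde f_m \le 1$ and that $\mathbb{E}\tilde f_m < \delta/4$ for $m \notin M$:

```latex
\begin{align*}
\mathbb{E}_{m \in G}\,\mathbb{E}\tilde f_m
 &= \frac{1}{|G|}\sum_{m \in M}\mathbb{E}\tilde f_m
  + \frac{1}{|G|}\sum_{m \notin M}\mathbb{E}\tilde f_m
 \;\le\; \frac{|M|}{|G|}\cdot 1 + \frac{\delta}{4}, \\
\frac{\delta}{2} \le \frac{|M|}{|G|} + \frac{\delta}{4}
 &\implies |M| \ge \frac{\delta}{4}\,|G|.
\end{align*}
```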


The proof of Theorem 7.1, motivated by arguments in [HdH], is split into two parts. In
the first part, we find a bounded model $\tilde f : G \to [0, 1]$ for $f$ in the sense that $\|f - \tilde f\|_D$ is
small, where the norm $\|\cdot\|_D$ is defined as follows.


Definition 7.2. Fix a positive integer $k \ge 2$. For any function $f : G \to \mathbb{R}$ and any $1 \le i \le k$,
define

    $\|f\|_{D,i} = \sup |\Lambda_D(f_1, \dots, f_{i-1}, f, f_{i+1}, \dots, f_k)|,$

where the supremum is taken over all functions $f_1, \dots, f_{i-1}, f_{i+1}, \dots, f_k : G \to [-1, 1]$.
Furthermore, define

    $\|f\|_D = \sup_{1 \le i \le k} \|f\|_{D,i}.$

It can be easily verified that these are indeed norms; however, we will not need this fact.


Proposition 7.3 (Approximation by bounded functions). Let $k \ge 2$ be a positive integer.
Let $N$ be a sufficiently large prime, and let $G = \mathbb{Z}/N\mathbb{Z}$. Let $f, \nu : G \to \mathbb{R}$ be functions
satisfying $0 \le f \le \nu$. Let $D, S \ge 2$ be positive integers satisfying $S = o(D)$. Suppose
that $\nu$ satisfies the $k$-linear forms conditions with width $S$. Then there exists a function
$\tilde f : G \to [0, 1]$ with $\mathbb{E}\tilde f = \mathbb{E}f + o(1)$, such that $\|f - \tilde f\|_D = o(1)$.

In the second part of the proof of Theorem 7.1, we show that the $k$-AP counts for the
original function $f$ and for its bounded model $\tilde f$ are close.
Proposition 7.4 (Counting lemma). Let $k \ge 2$ be a positive integer. Let $N$ be a sufficiently
large prime, and let $G = \mathbb{Z}/N\mathbb{Z}$. Let $D, S \ge 2$ be positive integers satisfying $S = o(D)$.
Let $\nu : G \to \mathbb{R}$ be a function satisfying the $k$-linear forms conditions with width $S$. For
$1 \le i \le k$, let $f_i, \tilde f_i, \nu_i : G \to \mathbb{R}$ be functions with $\nu_i \in \{\nu, 1\}$, $0 \le f_i \le \nu_i$ and $0 \le \tilde f_i \le 1$. If
$\|f_i - \tilde f_i\|_{D,i} = o(1)$ for each $1 \le i \le k$, then

    $|\Lambda_D(f_1, \dots, f_k) - \Lambda_D(\tilde f_1, \dots, \tilde f_k)| = o(1).$


Clearly Theorem 7.1 follows by combining Propositions 7.3 and 7.4. The proof of
Proposition 7.3, presented in Section 8, follows closely the proof of [HI Lemma 3.3], using the
Green-Tao-Ziegler dense model theorem. The proof of Proposition 7.4, presented in Section
9, follows closely a densification argument in [H Section 6].
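
One way to spell out how the two propositions combine is the following sketch, which takes all $k$ slots equal (this particular choice of $f_i$, $\tilde f_i$, $\nu_i$ is our reading of "combining" and is not written out in the text).

```latex
% Proposition 7.3 supplies \tilde f with \|f - \tilde f\|_D = o(1).
% Then apply Proposition 7.4 with f_i = f, \tilde f_i = \tilde f, \nu_i = \nu for every i.
\begin{align*}
\|f - \tilde f\|_{D,i}
  &\le \sup_{1 \le j \le k}\|f - \tilde f\|_{D,j}
   = \|f - \tilde f\|_D = o(1) \qquad (1 \le i \le k), \\
\bigl|\Lambda_D(f,\dots,f) - \Lambda_D(\tilde f,\dots,\tilde f)\bigr| &= o(1).
\end{align*}
```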


8. The dense model theorem 

In this section we prove Proposition 7.3. The main tool used is the Green-Tao-Ziegler
dense model theorem. Indeed, for each $1 \le i \le k$ a straightforward application of this
dense model theorem produces a function $\tilde f_i$ such that $\|f - \tilde f_i\|_{D,i} = o(1)$. However, some
extra effort is needed to obtain a single model $\tilde f$ that is close to $f$ in the norm $\|\cdot\|_{D,i}$
for every $i$. To achieve this, we define the following stronger notion of closeness (compare
with [18, Definition 3.1]).


8.1. Discrepancy pairs. 


Definition 8.1 (Discrepancy pair). Fix a positive integer $\ell$ and a linear form $\xi : \mathbb{Z}^\ell \to \mathbb{Z}$ in $\ell$
variables. Let $S \ge 2$ be a positive integer and $\varepsilon > 0$ be real. For two functions $f, \tilde f : G \to \mathbb{R}$,
we say that $(f, \tilde f)$ is an $\varepsilon$-discrepancy pair with width $S$ with respect to $\xi$, if for all functions
$u_1, \dots, u_\ell : G \times \mathbb{Z}^\ell \to [-1, 1]$ with $u_i$ not depending on the $(i + 1)$th coordinate, we have

(8.1)    $\Bigl| \mathbb{E}_{n \in G}\mathbb{E}_{s \in [S]^\ell} \bigl(f(n + \xi(s)) - \tilde f(n + \xi(s))\bigr) \prod_{i=1}^{\ell} u_i(n, s) \Bigr| \le \varepsilon.$

Note that, if $s = (s_1, \dots, s_\ell)$, then the value of $u_i(n, s)$ does not depend on $s_i$. We will
be interested in discrepancy pairs with respect to the linear forms $\psi_j$ and $\psi$, defined in (2.1)
and (2.2) respectively. Note that $\psi$ is a linear form in $k$ variables, while each $\psi_j$ is a linear
form in $k - 1$ variables. The following two lemmas imply that discrepancy pairs with respect
to $\psi$ are automatically discrepancy pairs with respect to every $\psi_j$.

Lemma 8.2. Let $\xi : \mathbb{Z}^\ell \to \mathbb{Z}$ be a linear form in $\ell$ variables, and let $\xi' : \mathbb{Z}^{\ell-1} \to \mathbb{Z}$ be the
linear form defined by $\xi'(s) = \xi(s, 0)$ for any $s \in \mathbb{Z}^{\ell-1}$. Let $S \ge 2$ be a positive integer and
$\varepsilon > 0$ be real. If $(f, \tilde f)$ is an $\varepsilon$-discrepancy pair with width $S$ with respect to $\xi$, then $(f, \tilde f)$ is
also an $\varepsilon$-discrepancy pair with width $S$ with respect to $\xi'$.

Proof. Let $u'_1, \dots, u'_{\ell-1} : G \times \mathbb{Z}^{\ell-1} \to [-1, 1]$ be arbitrary functions with $u'_i$ not depending on the
$(i + 1)$th coordinate. By definition, it suffices to show that

    $\Bigl| \mathbb{E}_{n \in G}\mathbb{E}_{s \in [S]^{\ell-1}} \bigl(f(n + \xi'(s)) - \tilde f(n + \xi'(s))\bigr) \prod_{i=1}^{\ell-1} u'_i(n, s) \Bigr| \le \varepsilon.$

Introduce a new variable $s_\ell \in [S]$, and note that $\xi(s, s_\ell) = \xi'(s) + as_\ell$ for some $a \in \mathbb{Z}$.
After translating $n$ by $as_\ell$ and averaging over $s_\ell$, we may rewrite the average above as

    $\mathbb{E}_{n \in G}\mathbb{E}_{s \in [S]^{\ell-1}}\mathbb{E}_{s_\ell \in [S]} \bigl(f(n + \xi(s, s_\ell)) - \tilde f(n + \xi(s, s_\ell))\bigr) \prod_{i=1}^{\ell-1} u'_i(n + as_\ell, s).$

This can be further rewritten in the form

    $\mathbb{E}_{n \in G}\mathbb{E}_{s = (s_1, \dots, s_\ell) \in [S]^\ell} \bigl(f(n + \xi(s)) - \tilde f(n + \xi(s))\bigr) \prod_{i=1}^{\ell} u_i(n, s),$

where $u_i$ is defined by $u_i(n, s_1, \dots, s_\ell) = u'_i(n + as_\ell, s_1, \dots, s_{\ell-1})$ for $i \le \ell - 1$, and
$u_\ell(n, s_1, \dots, s_\ell) = 1$, so that $u_i(n, s_1, \dots, s_\ell)$ does not depend on $s_i$ for every $i$. Thus
the average above is indeed bounded by $\varepsilon$ since $(f, \tilde f)$ is an $\varepsilon$-discrepancy pair with respect
to $\xi$. □

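Lemma 8.3. Let $\xi, \xi'$ be two linear forms in $\ell$ variables defined by

    $\xi(s_1, \dots, s_\ell) = a_1s_1 + \dots + a_\ell s_\ell, \qquad \xi'(s_1, \dots, s_\ell) = a'_1s_1 + \dots + a'_\ell s_\ell$

for some $a_1, \dots, a_\ell, a'_1, \dots, a'_\ell \in \mathbb{Z} \setminus \{0\}$ such that $a'_i$ divides $a_i$ and $a_i/a'_i$ divides $Q$ for each
$i$, where $Q$ is a positive integer. Let $S \ge 2$ be a positive integer and $\varepsilon > 0$ be real. If $(f, \tilde f)$
is an $\varepsilon$-discrepancy pair with width $S$ with respect to $\xi$, then $(f, \tilde f)$ is also an $\varepsilon$-discrepancy
pair with width $QS$ with respect to $\xi'$.

Proof. Let $u'_1, \dots, u'_\ell : G \times \mathbb{Z}^\ell \to [-1, 1]$ be arbitrary functions with $u'_i$ not depending on the
$(i + 1)$th coordinate. Write $S' = QS$. By definition, it suffices to show that

    $\Bigl| \mathbb{E}_{n \in G}\mathbb{E}_{s \in [S']^\ell} \bigl(f(n + \xi'(s)) - \tilde f(n + \xi'(s))\bigr) \prod_{i=1}^{\ell} u'_i(n, s) \Bigr| \le \varepsilon.$

For each $i$ let $q_i = |a_i/a'_i|$. Since $q_i$ divides $Q$, we may split the interval $[S']$ into $Q$ arithmetic
progressions $P_{i1}, \dots, P_{iQ}$, each of which has length $S$ and common difference $q_i$. It suffices
to show that, for any choice of $P_i \in \{P_{i1}, \dots, P_{iQ}\}$, we always have

    $\Bigl| \mathbb{E}_{n \in G}\mathbb{E}_{s \in P_1 \times \dots \times P_\ell} \bigl(f(n + \xi'(s)) - \tilde f(n + \xi'(s))\bigr) \prod_{i=1}^{\ell} u'_i(n, s) \Bigr| \le \varepsilon.$

Write $s = (s_1, \dots, s_\ell)$. For $s_i \in P_i$, make the change of variable $s_i = q_it_i + r_i$ with $t_i \in [S]$.
Since $\xi'(s) = \xi(t) + r$ where $t = (t_1, \dots, t_\ell)$ and $r = a'_1r_1 + \dots + a'_\ell r_\ell$, the inequality above
is equivalent to

    $\Bigl| \mathbb{E}_{n \in G}\mathbb{E}_{t \in [S]^\ell} \bigl(f(n + \xi(t)) - \tilde f(n + \xi(t))\bigr) \prod_{i=1}^{\ell} u_i(n, t) \Bigr| \le \varepsilon,$

where $u_i(n, t_1, \dots, t_\ell) = u'_i(n - r, q_1t_1 + r_1, \dots, q_\ell t_\ell + r_\ell)$. This follows from the assumption
that $(f, \tilde f)$ is an $\varepsilon$-discrepancy pair with width $S$ with respect to $\xi$, since $u_i$ does not depend
on $t_i$. □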
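
To make the splitting of $[S']$ concrete, here is a small worked instance. The values $Q = 6$, $q_i = 2$, $S = 2$ are chosen purely for illustration and do not come from the text.

```latex
% Illustrative values (not from the text): Q = 6, q_i = 2, S = 2, so S' = QS = 12.
\begin{align*}
[12] = \{1,3\} \cup \{2,4\} \cup \{5,7\} \cup \{6,8\} \cup \{9,11\} \cup \{10,12\}.
\end{align*}
% This gives Q = 6 progressions, each of length S = 2 and common difference q_i = 2.
% On the block P_i = \{5,7\}, the change of variable s_i = q_i t_i + r_i
% reads s_i = 2 t_i + 3 with t_i \in [2] = \{1,2\}, so r_i = 3.
```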

Lemma 8.4. Let $S, D \ge 2$ be positive integers with $S = o(D)$. If $(f, \tilde f)$ is an $o(1)$-discrepancy
pair with width $S$ with respect to $\psi_i$ for some $1 \le i \le k$, and moreover $\|f - \tilde f\|_{L^1} = O(1)$,
then $\|f - \tilde f\|_{D,i} = o(1)$.

Proof. Without loss of generality, we may assume that $i = 1$, so that $\psi_1$ is a linear form
in the $k - 1$ variables $s_2, \dots, s_k$ defined in (2.1). Let $f_2, \dots, f_k : G \to [-1, 1]$ be arbitrary
functions. Fix an arbitrary $s_1 \in \mathbb{Z}$. For $2 \le i \le k$, define the function $u_i : G \times \mathbb{Z}^{k-1} \to [-1, 1]$ by

    $u_i(n, s_2, \dots, s_k) = f_i(n + \psi_i(s_1, \dots, s_k)).$

Note that $u_i$ does not depend on $s_i$. Since $(f, \tilde f)$ is an $o(1)$-discrepancy pair with width $S$
with respect to $\psi_1$, we have

(8.2)    $\Bigl| \mathbb{E}_{n \in G}\mathbb{E}_{s_2, \dots, s_k \in [S]} \bigl(f(n + \psi_1(s)) - \tilde f(n + \psi_1(s))\bigr) \prod_{i=2}^{k} f_i(n + \psi_i(s)) \Bigr| = o(1)$

for every $s_1 \in \mathbb{Z}$, where $s = (s_1, \dots, s_k)$. Note that

    $\Lambda_D(f - \tilde f, f_2, \dots, f_k) = \mathbb{E}_{n \in G}\mathbb{E}_{d \in [D]} \bigl(f(n) - \tilde f(n)\bigr) \prod_{i=2}^{k} f_i(n + (i - 1)d).$

Introduce new variables $s_2, \dots, s_k$ taking values in $[S]$. Shifting $d$ by $s_2 + \dots + s_k$ causes an
error bounded by

    $O\Bigl(\frac{S}{D}\,\mathbb{E}_{n \in G}|f(n) - \tilde f(n)|\Bigr) = o\bigl(\|f - \tilde f\|_{L^1}\bigr) = o(1)$

by hypothesis. Thus

    $\Lambda_D(f - \tilde f, f_2, \dots, f_k) = \mathbb{E}_{d \in [D]}\mathbb{E}_{n \in G}\mathbb{E}_{s_2, \dots, s_k \in [S]} \bigl(f(n) - \tilde f(n)\bigr) \prod_{i=2}^{k} f_i(n + (i - 1)(d + s_2 + \dots + s_k)) + o(1).$

After renaming $d$ by $s_1$ and replacing $n$ by $n + \psi_1(s)$, we may transform this into

    $\mathbb{E}_{s_1 \in [D]}\mathbb{E}_{n \in G}\mathbb{E}_{s_2, \dots, s_k \in [S]} \bigl(f(n + \psi_1(s)) - \tilde f(n + \psi_1(s))\bigr) \prod_{i=2}^{k} f_i(n + \psi_i(s)) + o(1).$

By (8.2), for each $s_1$ the inner average above is $o(1)$. This completes the proof. □
8.2. Proof of Proposition 7.3. For a positive integer $S \ge 2$, let $\mathcal{F}_S$ be the collection of
all functions that are convex combinations of functions $u : G \to \mathbb{R}$ of the form

(8.3)    $u(n) = \mathbb{E}_{s \in [S]^k} \prod_{i=1}^{k} u_i(n - \psi(s), s)$

for some $u_1, \dots, u_k : G \times \mathbb{Z}^k \to [-1, 1]$ with $u_i(n, s)$ not depending on $s_i$, where $\psi$ is defined
in (2.2). In view of Lemma 8.4, to prove Proposition 7.3 it suffices to find $\tilde f$ such that $(f, \tilde f)$
forms an $o(1)$-discrepancy pair with width $o(D)$ with respect to each $\psi_i$. By Lemmas 8.2
and 8.3, it suffices to find $\tilde f$ such that $(f, \tilde f)$ forms an $o(1)$-discrepancy pair with width $o(D)$
with respect to $\psi$. In other words, we need to ensure that

    $|\langle f - \tilde f, u\rangle| = o(1)$

for any $u \in \mathcal{F}_{o(D)}$. This will be achieved by the Green-Tao-Ziegler dense model theorem [7, 15]
(with simplified proofs in [6, 13]).

Lemma 8.5 (Green-Tao-Ziegler dense model theorem). For any $\varepsilon > 0$, there is a positive
integer $K = K(\varepsilon)$ and a positive constant $\varepsilon' = \varepsilon'(\varepsilon)$ such that the following statement
holds. Let $\mathcal{F}$ be an arbitrary collection of functions $u : X \to [-1, 1]$ on a finite set $X$. Let
$\nu : X \to \mathbb{R}_{\ge 0}$ be a function satisfying

    $|\langle \nu - 1, u\rangle| \le \varepsilon'$

for all $u \in \mathcal{F}^K$, where $\mathcal{F}^K$ consists of all functions of the form $u_1u_2 \cdots u_K$ with each $u_i \in \mathcal{F}$.
For any function $f : X \to \mathbb{R}_{\ge 0}$ with $f \le \nu$ and $\mathbb{E}f \le 1$, there is a function $\tilde f : X \to [0, 1]$
with the properties that $\mathbb{E}\tilde f = \mathbb{E}f$, and moreover

    $|\langle f - \tilde f, u\rangle| \le \varepsilon$

for all $u \in \mathcal{F}$.

In view of this, the task of proving Proposition 7.3 reduces to proving the following two
lemmas, the first of which says that $\mathcal{F}_S$ is almost closed under pointwise multiplication,
and the second of which verifies the hypothesis in Lemma 8.5 about the majorant $\nu$.


Lemma 8.6. Let $K$ be a positive integer and $\varepsilon \in (0, 1)$ be real. Let $S, T \ge 2$ be positive
integers with $T \le \varepsilon S$. For any function $u \in \mathcal{F}_S^K$, there is a function $v \in \mathcal{F}_T$ satisfying
$\|u - v\|_\infty = O(K\varepsilon)$.

Proof. It suffices to prove this when $u = u_1u_2 \cdots u_K$ and each $u_j : G \to [-1, 1]$ is of the form

    $u_j(n) = \mathbb{E}_{s \in [S]^k} \prod_{i=1}^{k} u_{ji}(n - \psi(s), s),$

for some $u_{ji} : G \times \mathbb{Z}^k \to [-1, 1]$ with $u_{ji}(n, s)$ not depending on $s_i$, since every function in $\mathcal{F}_S$
is a convex combination of functions of this form. We may write

    $u(n) = \mathbb{E}_{s_1, \dots, s_K \in [S]^k} \prod_{j=1}^{K} \prod_{i=1}^{k} u_{ji}(n - \psi(s_j), s_j).$

Introduce the auxiliary variables $t = (t_1, \dots, t_k) \in [T]^k$, and note that translating each $s_{ji}$
($1 \le j \le K$) by $t_i$ changes the average by $O(K\varepsilon)$. Thus

    $u(n) = \mathbb{E}_{s_1, \dots, s_K \in [S]^k}\mathbb{E}_{t \in [T]^k} \prod_{i=1}^{k} \prod_{j=1}^{K} u_{ji}(n - \psi(s_j) - \psi(t), s_j + t) + O(K\varepsilon).$

For fixed $s_1, \dots, s_K \in [S]^k$, consider the function $v : G \to [-1, 1]$ defined by

    $v(n) = \mathbb{E}_{t \in [T]^k} \prod_{i=1}^{k} v_i(n - \psi(t), t),$

where $v_i : G \times \mathbb{Z}^k \to [-1, 1]$ is defined by

    $v_i(n, t) = \prod_{j=1}^{K} u_{ji}(n - \psi(s_j), s_j + t).$

Thus we have approximated $u$ by a convex combination of these functions $v$, up to an error
of $O(K\varepsilon)$ in the $L^\infty$-norm. Since $v_i(n, t)$ does not depend on $t_i$, we have $v \in \mathcal{F}_T$. This
completes the proof. □


Lemma 8.7. Let $S \ge 2$ be a positive integer. If $\nu$ satisfies the $k$-linear forms conditions
with width $S$, then

    $|\langle \nu - 1, u\rangle| = o(1)$

for any $u \in \mathcal{F}_S$.

Proof. It suffices to prove this for $u$ of the form (8.3). We may write

    $\langle \nu - 1, u\rangle = \mathbb{E}_{n \in G}\mathbb{E}_{s \in [S]^k} \bigl(\nu(n + \psi(s)) - 1\bigr) \prod_{i=1}^{k} u_i(n, s)$

after a change of variable, where $u_i(n, s)$ does not depend on $s_i$. To upper bound this, we
will apply the Cauchy-Schwarz inequality $k$ times, with respect to the variable $s_i$ in the $i$th
step. Since $u_i$ does not depend on $s_i$, the Cauchy-Schwarz step with respect to $s_i$ eliminates
the function $u_i$. In the end we arrive at

    $|\langle \nu - 1, u\rangle|^{2^k} \le \mathbb{E}_{n \in G}\mathbb{E}_{s^{(0)}, s^{(1)} \in [S]^k} \prod_{\omega \in \{0,1\}^k} \bigl(\nu(n + \psi(s^{(\omega)})) - 1\bigr),$

where $s^{(\omega)} = (s_1^{(\omega_1)}, \dots, s_k^{(\omega_k)})$. This is $o(1)$ after expanding out the product since $\nu$ satisfies
(the second set of) the $k$-linear forms conditions with width $S$. □
Proof of Proposition 7.3. Let $\varepsilon = \varepsilon(N) > 0$ be a function decaying to zero sufficiently slowly.
Since $\nu$ satisfies the $k$-linear forms conditions with width $S$, we have by Lemma 8.7

    $|\langle \nu - 1, v\rangle| = o(1)$

for any $v \in \mathcal{F}_S$. Choose a positive integer $S'$ such that $S' = o(D)$ and $S = o(S')$. Let
$K = K(\varepsilon)$ and $\varepsilon' = \varepsilon'(\varepsilon)$ be constants from the Green-Tao-Ziegler dense model theorem.
For any function $u \in \mathcal{F}_{S'}^K$ we have by Lemma 8.6 an approximation $v \in \mathcal{F}_S$ satisfying

    $\|u - v\|_\infty = o(K) \le \varepsilon'/4,$

provided that $\varepsilon$ decays slowly enough compared to the decay rate in $S = o(S')$. Thus for
any $u \in \mathcal{F}_{S'}^K$, we have

    $|\langle \nu - 1, u\rangle| \le o(1) + \tfrac{1}{4}\varepsilon'\|\nu - 1\|_{L^1} \le o(1) + \bigl(\tfrac{1}{2} + o(1)\bigr)\varepsilon' \le \varepsilon'.$

By the Green-Tao-Ziegler dense model theorem, we may find $\tilde f : G \to [0, 1]$ with the
properties that $\mathbb{E}\tilde f = \mathbb{E}f$, and moreover

    $|\langle f - \tilde f, u\rangle| \le \varepsilon$

for any $u \in \mathcal{F}_{S'}$. By definition, this implies that $(f, \tilde f)$ is an $\varepsilon$-discrepancy pair with width
$S'$ with respect to $\psi$. It follows from Lemmas 8.2 and 8.3 that $(f, \tilde f)$ is an $\varepsilon$-discrepancy pair
with width $S$ with respect to each $\psi_i$. Since

    $\|f - \tilde f\|_{L^1} \le \|f\|_{L^1} + 1 \le \|\nu\|_{L^1} + 1 = 2 + o(1),$

we may apply Lemma 8.4 to conclude that

    $\|f - \tilde f\|_{D,i} = o(1)$

for each $1 \le i \le k$. This completes the proof. □
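
The verification of the dense model hypothesis in the proof above is a triangle inequality followed by Hölder. A sketch, assuming (as elsewhere in the argument) that $\mathbb{E}\nu = 1 + o(1)$:

```latex
% u \in \mathcal{F}_{S'}^K and v \in \mathcal{F}_S with \|u - v\|_\infty \le \varepsilon'/4.
\begin{align*}
|\langle \nu - 1, u\rangle|
 &\le |\langle \nu - 1, v\rangle| + |\langle \nu - 1, u - v\rangle| \\
 &\le o(1) + \|u - v\|_\infty\,\|\nu - 1\|_{L^1} \\
 &\le o(1) + \tfrac{1}{4}\varepsilon'\,\bigl(\mathbb{E}\nu + 1\bigr)
  = o(1) + \tfrac{1}{4}\varepsilon'\,(2 + o(1)).
\end{align*}
```

The right side is at most $\varepsilon'$ once the $o(1)$ terms are small compared with $\varepsilon'$, which is what the slow decay of $\varepsilon$ is meant to guarantee.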

9. The counting lemma 

In this section we prove Proposition 7.4 by induction on the number of indices $i$ with
$\nu_i \ne 1$. Consider first the base case when $\nu_i = 1$ for all $i$. Note that

    $\Lambda_D(f_1, \dots, f_k) - \Lambda_D(\tilde f_1, \dots, \tilde f_k) = \sum_{i=1}^{k} \Lambda_D(\tilde f_1, \dots, \tilde f_{i-1}, f_i - \tilde f_i, f_{i+1}, \dots, f_k).$

For each $1 \le i \le k$, since $\tilde f_1, \dots, \tilde f_{i-1}, f_{i+1}, \dots, f_k$ are all bounded by 1, the $i$th summand is
bounded in absolute value by $\|f_i - \tilde f_i\|_{D,i}$. The conclusion follows immediately.
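
For concreteness, the telescoping identity in the base case reads as follows when $k = 3$; it uses only the multilinearity of $\Lambda_D$.

```latex
\begin{align*}
\Lambda_D(f_1, f_2, f_3) - \Lambda_D(\tilde f_1, \tilde f_2, \tilde f_3)
 &= \Lambda_D(f_1 - \tilde f_1,\, f_2,\, f_3) \\
 &\quad + \Lambda_D(\tilde f_1,\, f_2 - \tilde f_2,\, f_3) \\
 &\quad + \Lambda_D(\tilde f_1,\, \tilde f_2,\, f_3 - \tilde f_3).
\end{align*}
```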

We now turn to the inductive step. Assume that $\nu_j \ne 1$ for some $1 \le j \le k$, and without
loss of generality we may assume that $\nu_1 \ne 1$. We split the difference $\Lambda_D(f_1, \dots, f_k) -
\Lambda_D(\tilde f_1, \dots, \tilde f_k)$ into the sum of
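
    $\mathbb{E}_{n \in G} \bigl(f_1(n) - \tilde f_1(n)\bigr)\, \mathbb{E}_{d \in [D]} \prod_{i=2}^{k} \tilde f_i(n + (i - 1)d)$

and

    $\mathbb{E}_{n \in G} f_1(n)\, \mathbb{E}_{d \in [D]} \Bigl( \prod_{i=2}^{k} f_i(n + (i - 1)d) - \prod_{i=2}^{k} \tilde f_i(n + (i - 1)d) \Bigr).$

The first expression is bounded in absolute value by $\|f_1 - \tilde f_1\|_{D,1} = o(1)$ since all $\tilde f_i$ are
bounded by 1. Thus it suffices to show that the second expression is $o(1)$.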

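To simplify the notation, define $f'_1, \tilde f'_1 : G \to \mathbb{R}$ by

    $f'_1(n) = \mathbb{E}_{d \in [D]} \prod_{i=2}^{k} f_i(n + (i - 1)d), \qquad \tilde f'_1(n) = \mathbb{E}_{d \in [D]} \prod_{i=2}^{k} \tilde f_i(n + (i - 1)d),$

and define also $\nu'_1 : G \to \mathbb{R}$ similarly by

    $\nu'_1(n) = \mathbb{E}_{d \in [D]} \prod_{i=2}^{k} \nu_i(n + (i - 1)d).$

Clearly $0 \le f'_1 \le \nu'_1$ and $0 \le \tilde f'_1 \le 1$, and our goal is to show that

    $\mathbb{E}_{n \in G} f_1(n)\bigl(f'_1(n) - \tilde f'_1(n)\bigr) = o(1).$

After an application of Cauchy-Schwarz and using $0 \le f_1 \le \nu$, the task becomes to show
that

    $\bigl(\mathbb{E}_{n \in G}\nu(n)\bigr)\, \mathbb{E}_{n \in G}\nu(n)\bigl(f'_1(n) - \tilde f'_1(n)\bigr)^2 = o(1).$

Since the average of $\nu$ is $1 + o(1)$, it suffices to prove the inequalities

(9.1)    $\mathbb{E}_{n \in G}(\nu(n) - 1)\bigl(f'_1(n) - \tilde f'_1(n)\bigr)^2 = o(1)$

and

(9.2)    $\mathbb{E}_{n \in G}\bigl(f'_1(n) - \tilde f'_1(n)\bigr)^2 = o(1).$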
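
The reduction to (9.1) and (9.2) can be written out in two lines. In the sketch below $g$ abbreviates $f'_1 - \tilde f'_1$; the letter $g$ is introduced here only for brevity.

```latex
% g := f_1' - \tilde f_1'. Cauchy-Schwarz with weight f_1 \ge 0, then f_1 \le \nu.
\begin{align*}
\bigl|\mathbb{E}_{n} f_1(n) g(n)\bigr|^2
 &\le \bigl(\mathbb{E}_{n} f_1(n)\bigr)\bigl(\mathbb{E}_{n} f_1(n) g(n)^2\bigr)
 \le \bigl(\mathbb{E}_{n} \nu(n)\bigr)\bigl(\mathbb{E}_{n} \nu(n) g(n)^2\bigr), \\
\mathbb{E}_{n} \nu(n) g(n)^2
 &= \mathbb{E}_{n} (\nu(n) - 1) g(n)^2 + \mathbb{E}_{n} g(n)^2.
\end{align*}
```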



9.1. Proof of (9.1). Expanding the square and recalling the definitions of $f'_1$ and $\tilde f'_1$, we get
four terms of the form

(9.3)    $\mathbb{E}_{n \in G}(\nu(n) - 1)\, \mathbb{E}_{d^{(0)}, d^{(1)} \in [D]} \prod_{i=2}^{k} \prod_{r \in \{0,1\}} f_i^{(r)}(n + (i - 1)d^{(r)}),$

where each $f_i^{(r)}$ is either $f_i$ or $\tilde f_i$. It suffices to show that each term is $o(1)$. Introduce new variables
$s_i \in [S]$ for each $i > 1$, and translate both $d^{(0)}$ and $d^{(1)}$ by $s_2 + \dots + s_k$. This causes an error
bounded by

(9.4)    $O\Bigl( D^{-2}\, \mathbb{E}_{n \in G}(\nu(n) + 1) \sum_{(d^{(0)}, d^{(1)}) \in \Omega \cap \mathbb{Z}^2} \prod_{i=2}^{k} \prod_{r \in \{0,1\}} f_i^{(r)}(n + (i - 1)d^{(r)}) \Bigr),$

where $\Omega$ is the region defined by

    $\Omega = [1, D + kS]^2 \setminus [kS, D - kS]^2.$

Note that the area of $\Omega$ is $\asymp DS$. This error is $o(1)$ since $\nu$ satisfies (the third set of) the
$k$-linear forms conditions with width $S$. Thus (9.3) is

    $\mathbb{E}_{n \in G}(\nu(n) - 1)\, \mathbb{E}_{s = (s_2, \dots, s_k) \in [S]^{k-1}}\mathbb{E}_{d^{(0)}, d^{(1)} \in [D]} \prod_{i=2}^{k} \prod_{r \in \{0,1\}} f_i^{(r)}\bigl(n + (i - 1)(d^{(r)} + s_2 + \dots + s_k)\bigr) + o(1).$

Now replace $n$ by $n + \psi_1(s_2, \dots, s_k)$ to see that (9.3) is

    $\mathbb{E}_{n \in G}\mathbb{E}_{s = (s_2, \dots, s_k) \in [S]^{k-1}} \bigl(\nu(n + \psi_1(s)) - 1\bigr)\, \mathbb{E}_{d^{(0)}, d^{(1)} \in [D]} \prod_{i=2}^{k} \prod_{r \in \{0,1\}} f_i^{(r)}\bigl(n + \psi_i(d^{(r)}, s)\bigr) + o(1).$

This is $o(1)$ using the $k$-linear forms conditions on $\nu$ after applying the Cauchy-Schwarz inequality
$k - 1$ times with respect to $s_2, \dots, s_k$. See the following lemma for details.

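Lemma 9.1 (Gowers-Cauchy-Schwarz). Let $S \ge 2$ be real. Let $\nu : G \to \mathbb{R}_{\ge 0}$ be a function
satisfying the $k$-linear forms conditions with width $S$. For $2 \le i \le k$ and $r \in \{0, 1\}$, let $\nu_i^{(r)}$
be either $\nu$ or 1 and let $f_i^{(r)} : G \to \mathbb{R}_{\ge 0}$ be a function with $f_i^{(r)} \le \nu_i^{(r)}$. Let $S_1, \dots, S_k \ge S$
be positive integers. For each $1 \le \ell \le k$, define

    $I_\ell = \mathbb{E}_{n \in G}\, \mathbb{E}_{s_1^{(0)}, s_1^{(1)}, \dots, s_\ell^{(0)}, s_\ell^{(1)}, s_{\ell+1}, \dots, s_k} \prod_{\omega \in \{0,1\}^{[\ell] \setminus \{1\}}} \bigl(\nu(n + \psi_1(s^{(\omega)})) - 1\bigr)$
    $\qquad \cdot \prod_{i=2}^{\ell} \prod_{\omega \in \{0,1\}^{[\ell] \setminus \{i\}}} \nu_i^{(\omega_1)}(n + \psi_i(s^{(\omega)})) \prod_{i=\ell+1}^{k} \prod_{\omega \in \{0,1\}^{[\ell]}} f_i^{(\omega_1)}(n + \psi_i(s^{(\omega)})),$

where the average over $s_i^{(0)}, s_i^{(1)}$ or $s_i$ is understood to be in the range $[S_i]$ ($1 \le i \le k$).
Then $I_\ell = o(1)$ for each $1 \le \ell \le k$.

Here we adopted the natural convention that $s^{(\omega)} = (s_1^{(\omega_1)}, \dots, s_\ell^{(\omega_\ell)}, s_{\ell+1}, \dots, s_k)$ for $\omega \in \{0, 1\}^{[\ell]}$.
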

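Proof. First note that $I_k = o(1)$ follows from the $k$-linear forms conditions on $\nu$. Thus it
suffices to show that $I_{\ell-1}^2 \le (1 + o(1))I_\ell$ for any $2 \le \ell \le k$. After pulling out the terms
involving $\psi_\ell$, which do not depend on the variable $s_\ell$, we can rewrite $I_{\ell-1}$ as

    $\mathbb{E}_{n \in G}\, \mathbb{E}_{s_1^{(0)}, s_1^{(1)}, \dots, s_{\ell-1}^{(0)}, s_{\ell-1}^{(1)}, s_{\ell+1}, \dots, s_k} \Bigl[ \prod_{\omega \in \{0,1\}^{[\ell-1]}} f_\ell^{(\omega_1)}(n + \psi_\ell(s^{(\omega)})) \Bigr]$
    $\qquad \cdot\, \mathbb{E}_{s_\ell} \Bigl[ \prod_{\omega \in \{0,1\}^{[\ell-1] \setminus \{1\}}} \bigl(\nu(n + \psi_1(s^{(\omega)})) - 1\bigr) \prod_{i=2}^{\ell-1} \prod_{\omega \in \{0,1\}^{[\ell-1] \setminus \{i\}}} \nu_i^{(\omega_1)}(n + \psi_i(s^{(\omega)})) \prod_{i=\ell+1}^{k} \prod_{\omega \in \{0,1\}^{[\ell-1]}} f_i^{(\omega_1)}(n + \psi_i(s^{(\omega)})) \Bigr].$

By the Cauchy-Schwarz inequality in the $s_\ell$ variable, and using $f_\ell^{(r)} \le \nu_\ell^{(r)}$, we see that $I_{\ell-1}^2$ is
bounded by the product of

    $\mathbb{E}_{n \in G}\, \mathbb{E}_{s_1^{(0)}, s_1^{(1)}, \dots, s_{\ell-1}^{(0)}, s_{\ell-1}^{(1)}, s_{\ell+1}, \dots, s_k} \prod_{\omega \in \{0,1\}^{[\ell-1]}} \nu_\ell^{(\omega_1)}(n + \psi_\ell(s^{(\omega)}))$

and another term which, after expanding out the square, becomes exactly $I_\ell$. Since the
expression above is $1 + o(1)$ by the $k$-linear forms conditions on $\nu$, the desired claim follows.
□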

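9.2. Proof of (9.2). Split the left side of (9.2) into the sum of two terms

(9.5)    $\mathbb{E}_{n \in G}\bigl(f'_1(n) - \tilde f'_1(n)\bigr)\bigl(f'_1(n) - \min(f'_1(n), 1)\bigr) + \mathbb{E}_{n \in G}\bigl(f'_1(n) - \tilde f'_1(n)\bigr)\bigl(\min(f'_1(n), 1) - \tilde f'_1(n)\bigr).$

The first term above can be bounded by

    $\mathbb{E}_{n \in G}(\nu'_1(n) + 1)|\nu'_1(n) - 1| \le \mathbb{E}_{n \in G}|\nu'_1(n) - 1|^2 + 2\,\mathbb{E}_{n \in G}|\nu'_1(n) - 1|.$

This is $o(1)$ since the $L^2$-norm of $\nu'_1 - 1$ is $o(1)$ by the following lemma.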

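Lemma 9.2. Let $\nu'_1$ be defined as above. Then

    $\mathbb{E}_{n \in G}|\nu'_1(n) - 1|^2 = o(1).$

Proof. It suffices to show that

    $\mathbb{E}_{n \in G}\nu'_1(n)^2 = 1 + o(1), \qquad \mathbb{E}_{n \in G}\nu'_1(n) = 1 + o(1).$

We prove only the first bound; the second bound is dealt with similarly and is easier. Expanding
the square we get

    $\mathbb{E}_{n \in G}\nu'_1(n)^2 = \mathbb{E}_{n \in G}\mathbb{E}_{d^{(0)}, d^{(1)} \in [D]} \prod_{i=2}^{k} \prod_{r \in \{0,1\}} \nu_i(n + (i - 1)d^{(r)}).$

For $s_2, \dots, s_k \in [S]$, we may translate both $d^{(0)}$ and $d^{(1)}$ by $s_2 + \dots + s_k$ with an error in the
form of (9.4) (with $f_i^{(r)}$ there replaced by $\nu_i$), which is $o(1)$. Thus

    $\mathbb{E}_{n \in G}\nu'_1(n)^2 = \mathbb{E}_{n \in G}\mathbb{E}_{d^{(0)}, d^{(1)} \in [D]}\mathbb{E}_{s_2, \dots, s_k \in [S]} \prod_{i=2}^{k} \prod_{r \in \{0,1\}} \nu_i\bigl(n + (i - 1)(d^{(r)} + s_2 + \dots + s_k)\bigr) + o(1).$

After replacing $n$ by $n - s_2 - 2s_3 - \dots - (k - 1)s_k$ we obtain

    $\mathbb{E}_{n \in G}\nu'_1(n)^2 = \mathbb{E}_{n \in G}\mathbb{E}_{d^{(0)}, d^{(1)} \in [D]}\mathbb{E}_{s_2, \dots, s_k \in [S]} \prod_{i=2}^{k} \prod_{r \in \{0,1\}} \nu_i\bigl(n + \psi_i(s^{(r)})\bigr) + o(1),$

where $s^{(r)} = (d^{(r)}, s_2, \dots, s_k)$. Since $\nu_i \in \{\nu, 1\}$, this is $1 + o(1)$ by the $k$-linear forms
conditions on $\nu$, completing the proof of the lemma. □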
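
Returning to the first term of (9.5), the pointwise bounds behind the estimate can be checked as follows. The sketch uses only $0 \le f'_1 \le \nu'_1$ and $0 \le \tilde f'_1 \le 1$, together with Cauchy-Schwarz to pass from the $L^1$-norm to the $L^2$-norm controlled by Lemma 9.2.

```latex
\begin{align*}
|f_1' - \tilde f_1'| &\le \nu_1' + 1, \qquad
0 \le f_1' - \min(f_1', 1) = \max(f_1' - 1,\, 0) \le |\nu_1' - 1|, \\
(\nu_1' + 1)\,|\nu_1' - 1|
 &\le \bigl(|\nu_1' - 1| + 2\bigr)\,|\nu_1' - 1|
  = |\nu_1' - 1|^2 + 2\,|\nu_1' - 1|, \\
\mathbb{E}_{n}|\nu_1'(n) - 1|
 &\le \bigl(\mathbb{E}_{n}|\nu_1'(n) - 1|^2\bigr)^{1/2}.
\end{align*}
```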

It remains to bound the second sum in (9.5). First we claim that $\|\min(f'_1, 1) - \tilde f'_1\|_{D,1} =
o(1)$. To this end, let $u_2, \dots, u_k : G \to [-1, 1]$ be any functions, and define $u'_1 : G \to [-1, 1]$
by

    $u'_1(n) = \mathbb{E}_{d \in [D]} \prod_{i=2}^{k} u_i(n + (i - 1)d).$

By definition of the norm $\|\cdot\|_{D,1}$, we need to show that

    $\Lambda_D(\min(f'_1, 1) - \tilde f'_1, u_2, \dots, u_k) = o(1).$

The left side above can be written as

    $\mathbb{E}_{n \in G}\bigl[\min(f'_1(n), 1) - \tilde f'_1(n)\bigr]u'_1(n) = \mathbb{E}_{n \in G}\bigl[\min(f'_1(n), 1) - f'_1(n)\bigr]u'_1(n) + \mathbb{E}_{n \in G}\bigl(f'_1(n) - \tilde f'_1(n)\bigr)u'_1(n).$

The first term can be bounded in terms of the $L^2$-norm of $\nu'_1 - 1$, which is $o(1)$ by Lemma 9.2.
The second term can be rewritten as

    $\Lambda_D(u'_1, f_2, \dots, f_k) - \Lambda_D(u'_1, \tilde f_2, \dots, \tilde f_k).$

This is $o(1)$ by the induction hypothesis (since $u'_1$ is bounded by 1). This proves the claim.
Going back to the task of bounding the second sum in (9.5), we need to show that

    $\mathbb{E}_{n \in G}\bigl(f'_1(n) - \tilde f'_1(n)\bigr)\bigl(\min(f'_1(n), 1) - \tilde f'_1(n)\bigr) = o(1).$

Expand it into four terms:

    $\mathbb{E}_{n \in G}f'_1(n)\min(f'_1(n), 1) - \mathbb{E}_{n \in G}f'_1(n)\tilde f'_1(n) - \mathbb{E}_{n \in G}\tilde f'_1(n)\min(f'_1(n), 1) + \mathbb{E}_{n \in G}\tilde f'_1(n)^2.$

These terms can be rewritten as

    $\Lambda_D(\min(f'_1, 1), f_2, \dots, f_k) - \Lambda_D(\tilde f'_1, f_2, \dots, f_k) - \Lambda_D(\min(f'_1, 1), \tilde f_2, \dots, \tilde f_k) + \Lambda_D(\tilde f'_1, \tilde f_2, \dots, \tilde f_k).$

Up to sign, each of these four terms is $\Lambda_D(\tilde f'_1, \tilde f_2, \dots, \tilde f_k) + o(1)$ by the induction hypothesis, since both
$\min(f'_1, 1)$ and $\tilde f'_1$ are bounded by 1 and $\|\min(f'_1, 1) - \tilde f'_1\|_{D,1} = o(1)$ by the previous claim.
This proves (9.2) and completes the proof of Proposition 7.4.
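
The cancellation among the four terms can be displayed explicitly. In the sketch below $X$ is shorthand, introduced only here, for the common main term.

```latex
% X := \Lambda_D(\tilde f_1', \tilde f_2, \dots, \tilde f_k), which is bounded by 1 in absolute value.
\begin{align*}
&\Lambda_D(\min(f_1',1), f_2, \dots, f_k) - \Lambda_D(\tilde f_1', f_2, \dots, f_k) \\
&\qquad - \Lambda_D(\min(f_1',1), \tilde f_2, \dots, \tilde f_k)
        + \Lambda_D(\tilde f_1', \tilde f_2, \dots, \tilde f_k) \\
&= (X + o(1)) - (X + o(1)) - (X + o(1)) + X = o(1).
\end{align*}
```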